Description

[0001] This invention relates to recombinant toxin fragments, to DNA encoding these fragments and to their uses 
such as in a vaccine and for in vitro and in vivo purposes. 

[0002] The clostridial neurotoxins are potent inhibitors of calcium-dependent neurotransmitter secretion in neuronal
cells. They are currently considered to mediate this activity through a specific endoproteolytic cleavage of at least one
of three vesicle or pre-synaptic membrane associated proteins VAMP, syntaxin or SNAP-25 which are central to the
vesicle docking and membrane fusion events of neurotransmitter secretion. The neuronal cell targeting of tetanus and
botulinum neurotoxins is considered to be a receptor mediated event following which the toxins become internalised
and subsequently traffic to the appropriate intracellular compartment where they effect their endopeptidase activity.
[0003] Inhibition of calcium-independent [³H] noradrenaline outflow from freeze-thawed synaptosomes has also been
reported (Hausinger, A. et al. 1995, Toxicon, vol. 33, No. 11, p. 1519-1530).

[0004] The clostridial neurotoxins share a common architecture of a catalytic L-chain (LC, ca 50 kDa) disulphide
linked to a receptor binding and translocating H-chain (HC, ca 100 kDa). The HC polypeptide is considered to comprise
all or part of two distinct functional domains. The carboxy-terminal half of the HC (ca 50 kDa), termed the H_C domain,
is involved in the high affinity, neurospecific binding of the neurotoxin to cell surface receptors on the target neuron,
whilst the amino-terminal half, termed the H_N domain (ca 50 kDa), is considered to mediate the translocation of at least
some portion of the neurotoxin across cellular membranes such that the functional activity of the LC is expressed within
the target cell. The H_N domain also has the property, under conditions of low pH, of forming ion-permeable channels
in lipid membranes; this may in some manner relate to its translocation function.

[0005] For botulinum neurotoxin type A (BoNT/A) these domains are considered to reside within amino acid residues
872-1296 for the H_C, amino acid residues 449-871 for the H_N and residues 1-448 for the LC. Digestion with trypsin
effectively degrades the H_C domain of the BoNT/A to generate a non-toxic fragment designated LH_N, which is no longer
able to bind to and enter neurons (Fig. 1). The LH_N fragment so produced also has the property of enhanced solubility
compared to both the parent holotoxin and the isolated LC.
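The residue ranges quoted above for BoNT/A can be written down as a small lookup, which makes the domain lengths explicit. The following is a minimal Python sketch for illustration only; the labels H_N (translocation domain) and H_C (binding domain) and all function names are assumptions of this sketch, not part of the specification.

```python
# Domain boundaries for BoNT/A as quoted in the text (1-based, inclusive).
# The dictionary keys are illustrative labels chosen for this sketch.
BONT_A_DOMAINS = {
    "LC": (1, 448),      # catalytic light chain
    "H_N": (449, 871),   # amino-terminal half of the heavy chain
    "H_C": (872, 1296),  # carboxy-terminal half of the heavy chain
}


def domain_length(name: str) -> int:
    """Number of residues in the named domain."""
    start, end = BONT_A_DOMAINS[name]
    return end - start + 1


def domain_of(residue: int) -> str:
    """Name of the domain that contains a given residue number."""
    for name, (start, end) in BONT_A_DOMAINS.items():
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} lies outside 1-1296")


def lh_n_length() -> int:
    """Length of a fragment made of the LC plus the whole H_N domain."""
    return domain_length("LC") + domain_length("H_N")
```

On these figures the LC plus H_N fragment spans residues 1-871, and the H_N range itself covers 423 residues.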

[0006] It is therefore possible to provide functional definitions of the domains within the neurotoxin molecule, as
follows:

(A) clostridial neurotoxin light chain:

a metalloprotease exhibiting high substrate specificity for vesicle and/or plasma-membrane associated proteins
involved in the exocytotic process. In particular, it cleaves one or more of SNAP-25, VAMP (synaptobrevin
/ cellubrevin) and syntaxin. A single mutation (Glu224 → Ala234) in the light chain of tetanus toxin has been
shown to abolish proteolytic activity (Li, Y. et al. Biochemistry 1994, 33, p. 7014-7020).

(B) clostridial neurotoxin heavy chain H_N domain:

a portion of the heavy chain which enables translocation of that portion of the neurotoxin molecule such that
a functional expression of light chain activity occurs within a target cell.

the domain responsible for translocation of the endopeptidase activity, following binding of neurotoxin to its
specific cell surface receptor via the binding domain, into the target cell.

the domain responsible for formation of ion-permeable pores in lipid membranes under conditions of low pH.

the domain responsible for increasing the solubility of the entire polypeptide compared to the solubility of light
chain alone.

(C) clostridial neurotoxin heavy chain H_C domain:

a portion of the heavy chain which is responsible for binding of the native holotoxin to cell surface receptor(s)
involved in the intoxicating action of clostridial toxin prior to internalisation of the toxin into the cell.

[0007] The identity of the cellular recognition markers for these toxins is currently not understood and no specific
receptor species have yet been identified, although Kozaki et al. have reported that synaptotagmin may be the receptor
for botulinum neurotoxin type B. It is probable that each of the neurotoxins has a different receptor.
[0008] It is desirable to have positive controls for toxin assays, to develop clostridial toxin vaccines and to develop 
therapeutic agents incorporating desirable properties of clostridial toxin. 
[0009] For example, WO96/12802 describes vaccine compositions comprising the H_C region of C. botulinum.
[0010] However, due to its extreme toxicity, the handling of native toxin is hazardous. 

[0011] The present invention seeks to overcome or at least ameliorate problems associated with production and 
handling of clostridial toxin. 

[0012] Accordingly, the invention provides a polypeptide as defined in Claim 1 . 

[0013] The invention may thus provide a single polypeptide chain containing a domain equivalent to a clostridial toxin
light chain and a domain providing the functional aspects of the H_N of a clostridial toxin heavy chain, whilst lacking the
functional aspects of a clostridial toxin H_C domain.

[0014] For the purposes of the invention, the functional property or properties of the H_N of a clostridial toxin heavy
chain that are required to be exhibited by the second domain of the polypeptide of the invention are either (i) translocation
of the polypeptide into a cell, or (ii) increasing solubility of the polypeptide compared to solubility of the first
domain on its own or (iii) both (i) and (ii). References hereafter to a H_N domain or to the functions of a H_N domain are
references to this property or properties. The second domain is not required to exhibit other properties of the H_N domain
of a clostridial toxin heavy chain.

[0015] A polypeptide of the invention can thus be soluble but lack the translocation function of a native toxin - this is
of use in providing an immunogen for vaccinating or assisting to vaccinate an individual against challenge by toxin. In
a specific embodiment of the invention described in an example below a polypeptide designated LH423/A elicited neutralising
antibodies against type A neurotoxin. A polypeptide of the invention can likewise thus be relatively insoluble
but retain the translocation function of a native toxin - this is of use if solubility is imparted to a composition made up
of that polypeptide and one or more other components by one or more of said other components.
[0016] The first domain of the polypeptide of the invention cleaves one or more vesicle or plasma-membrane associated
proteins essential to the specific cellular process of exocytosis, and cleavage of these proteins results in inhibition
of exocytosis, typically in a non-cytotoxic manner. The cell or cells affected are not restricted to a particular type or
subgroup but can include both neuronal and non-neuronal cells. The activity of clostridial neurotoxins in inhibiting
exocytosis has, indeed, been observed almost universally in eukaryotic cells expressing a relevant cell surface receptor,
including such diverse cells as from Aplysia (sea slug), Drosophila (fruit fly) and mammalian nerve cells, and the activity
of the first domain is to be understood as including a corresponding range of cells.

[0017] The polypeptide of the invention may be obtained by expression of a recombinant nucleic acid, preferably a
DNA, and is a single polypeptide, that is to say not cleaved into separate light and heavy chain domains. The polypeptide
is thus available in convenient and large quantities using recombinant techniques.

[0018] In a polypeptide according to the invention, said first domain preferably comprises a clostridial toxin light chain
or a fragment or variant of a clostridial toxin light chain. The fragment is optionally an N-terminal, or C-terminal fragment
of the light chain, or is an internal fragment, so long as it substantially retains the ability to cleave the vesicle or
plasma-membrane associated protein essential to exocytosis. The minimal domains necessary for the activity of the light chain
of clostridial toxins are described in J. Biol. Chem., Vol. 267, No. 21, July 1992, pages 14721-14729. The variant has
a different peptide sequence from the light chain or from the fragment, though it too is capable of cleaving the vesicle
or plasma-membrane associated protein. It is conveniently obtained by insertion, deletion and/or substitution of a light
chain or fragment thereof. In embodiments of the invention described below a variant sequence comprises (i) an
N-terminal extension to a clostridial toxin light chain or fragment, (ii) a clostridial toxin light chain or fragment modified by
alteration of at least one amino acid, (iii) a C-terminal extension to a clostridial toxin light chain or fragment, or (iv)
combinations of 2 or more of (i)-(iii).

[0019] In further embodiments of the invention, the variant contains an amino acid sequence modified so that (a)
there is no protease sensitive region between the LC and H_N components of the polypeptide, or (b) the protease
sensitive region is specific for a particular protease. This latter embodiment is of use if it is desired to activate the
endopeptidase activity of the light chain in a particular environment or cell. In general, though, the polypeptides of the
invention are activated prior to administration.

[0020] The first domain preferably exhibits endopeptidase activity specific for a substrate selected from one or more
of SNAP-25, synaptobrevin/VAMP and syntaxin. The clostridial toxin is preferably botulinum toxin or tetanus toxin.
[0021] In an embodiment of the invention described in an example below, the toxin light chain and the portion of the 
toxin heavy chain are of botulinum toxin type A. In a further embodiment of the invention described in an example 
below, the toxin light chain and the portion of the toxin heavy chain are of botulinum toxin type B. The polypeptide 
optionally comprises a light chain or fragment or variant of one toxin type and a heavy chain or fragment or variant of 
another toxin type. 

[0022] In a polypeptide according to the invention, said second domain preferably comprises a clostridial toxin heavy
chain H_N portion or a fragment or variant of a clostridial toxin heavy chain H_N portion. The fragment is optionally an
N-terminal or C-terminal or internal fragment, so long as it retains the function of the H_N domain. Teachings of regions
within the H_N responsible for its function are provided for example in Biochemistry 1995, 34, pages 15175-15181 and
Eur. J. Biochem, 1989, 185, pages 197-203. The variant has a different sequence from the H_N domain or fragment,
though it too retains the function of the H_N domain. It is conveniently obtained by insertion, deletion and/or substitution
of a H_N domain or fragment thereof. In embodiments of the invention described below, it comprises (i) an N-terminal
extension to a H_N domain or fragment, (ii) a C-terminal extension to a H_N domain or fragment, (iii) a modification to a
H_N domain or fragment by alteration of at least one amino acid, or (iv) combinations of 2 or more of (i)-(iii). The clostridial
toxin is preferably botulinum toxin or tetanus toxin.

[0023] The invention also provides a polypeptide comprising a clostridial neurotoxin light chain and an N-terminal
fragment of a clostridial neurotoxin heavy chain, the fragment preferably comprising at least 423 of the N-terminal
amino acids of the heavy chain of botulinum toxin type A, 417 of the N-terminal amino acids of the heavy chain of 
botulinum toxin type B or the equivalent number of N-terminal amino acids of the heavy chain of other types of clostridial 
toxin such that the fragment possesses an equivalent alignment of homologous amino acid residues. 
[0024] These polypeptides of the invention are thus not composed of two or more polypeptides, linked for example 
by di-sulphide bridges into composite molecules. Instead, these polypeptides are single chains and are not active or 
their activity is significantly reduced in an in vitro assay of neurotoxin endopeptidase activity. 

[0025] Further, the polypeptides may be susceptible to be converted into a form exhibiting endopeptidase activity by 
the action of a proteolytic agent, such as trypsin. In this way it is possible to control the endopeptidase activity of the 
toxin light chain. 

[0026] In a specific embodiment of the invention described in an example below, there is provided a polypeptide
lacking a portion designated H_C of a clostridial toxin heavy chain. This portion, seen in the naturally produced toxin, is
responsible for binding of toxin to cell surface receptors prior to internalisation of the toxin. This specific embodiment
is therefore adapted so that it cannot be converted into active toxin, for example by the action of a proteolytic enzyme.
The invention thus also provides a polypeptide comprising a clostridial toxin light chain and a fragment of a clostridial
toxin heavy chain, said fragment being not capable of binding to those cell surface receptors involved in the intoxicating
action of clostridial toxin, and it is preferred that such a polypeptide lacks an intact portion designated H_C of a clostridial
toxin heavy chain.

[0027] In further embodiments of the invention there are provided compositions containing a polypeptide comprising
a clostridial toxin light chain and a portion designated H_N of a clostridial toxin heavy chain, and wherein the composition
is free of clostridial toxin and free of any clostridial toxin precursor that may be converted into clostridial toxin by the
action of a proteolytic enzyme. Examples of these compositions include those containing toxin light chain and H_N
sequences of botulinum toxin types A, B, C, D, E, F and G.

[0028] The polypeptides of the invention are conveniently adapted to bind to, or include, a ligand for targeting to
desired cells. The polypeptide optionally comprises a sequence that binds to, for example, an immunoglobulin. A suitable
sequence is a tandem repeat synthetic IgG binding domain derived from domain B of Staphylococcal protein A.
Choice of immunoglobulin specificity then determines the target for a polypeptide-immunoglobulin complex. Alternatively,
the polypeptide comprises a non-clostridial sequence that binds to a cell surface receptor, suitable sequences
including insulin-like growth factor-1 (IGF-1) which binds to its specific receptor on particular cell types and the 14
amino acid residue sequence from the carboxy-terminus of cholera toxin A subunit which is able to bind the cholera
toxin B subunit and thence to GM1 gangliosides. A polypeptide according to the invention thus, optionally, further
comprises a third domain adapted for binding of the polypeptide to a cell.

[0029] In a second aspect the invention provides a fusion protein comprising a fusion of (a) a polypeptide of the 
invention as described above with (b) a second polypeptide adapted for binding to a chromatography matrix so as to 
enable purification of the fusion protein using said chromatography matrix. It is convenient for the second polypeptide 
to be adapted to bind to an affinity matrix, such as a glutathione Sepharose, enabling rapid separation and purification 
of the fusion protein from an impure source, such as a cell extract or supernatant. 

[0030] One possible second purification polypeptide is glutathione-S-transferase (GST), and others will be apparent
to a person of skill in the art, being chosen so as to enable purification on a chromatography column according to
conventional techniques.

[0031] As noted above, by proteolytic treatment, for example using trypsin, of a polypeptide of the invention it is 
possible to induce endopeptidase activity in the treated polypeptide. 

[0032] A third aspect of the invention provides a composition comprising a polypeptide of the present invention, said
composition being non-toxic in vivo. The overall endopeptidase activity of the composition will, of course, be determined
by the amount of the polypeptide that is present.

[0033] While it is known to treat naturally produced clostridial toxin to remove the H_C domain, this treatment does
not totally remove toxicity of the preparation; instead some residual toxin activity remains. Natural toxin treated in this
way is therefore still not entirely safe. The composition of the invention, derived by treatment of a pure source of
polypeptide, advantageously is free of toxicity, and can conveniently be used as a positive control in a toxin assay, as
a vaccine against clostridial toxin or for other purposes where it is essential that there is no residual toxicity in the
composition.

[0034] The invention enables production of the polypeptides and fusion proteins of the invention by recombinant
means.

[0035] A fourth aspect of the invention provides a nucleic acid encoding a polypeptide or a fusion protein according 
to any of the aspects of the invention described above. 

[0036] In one embodiment of this aspect of the invention, a DNA sequence provided to code for the polypeptide or 
fusion protein is not derived from native clostridial sequences, but is an artificially derived sequence not preexisting in 
nature. 

[0037] A specific DNA (SEQ ID NO: 1) described in more detail below encodes a polypeptide or a fusion protein
comprising nucleotides encoding residues 1-871 of a botulinum toxin type A. Said polypeptide comprises the light chain
domain and the first 423 amino acid residues of the amino terminal portion of a botulinum toxin type A heavy chain.
This recombinant product is designated LH423/A (SEQ ID NO: 2).

[0038] In a second embodiment of this aspect of the invention a DNA sequence which codes for the polypeptide or
fusion protein is derived from native clostridial sequences but codes for a polypeptide or fusion protein not found in
nature.

[0039] A specific DNA (SEQ ID NO: 19) described in more detail below encodes a polypeptide or a fusion protein
and comprises nucleotides encoding residues 1-1171 of a botulinum toxin type B. Said polypeptide comprises the light
chain domain and the first 728 amino acid residues of the amino terminal portion of a botulinum type B heavy chain.
This recombinant product is designated LH728/B (SEQ ID NO: 20).

[0040] The invention thus also provides a method of manufacture of a polypeptide comprising expressing in a host
cell a DNA according to the fourth aspect of the invention. The host cell is suitably not able to cleave a polypeptide or
fusion protein of the invention so as to separate light and heavy toxin chains; for example, a non-clostridial host.
[0041] The invention further provides a method of manufacture of a polypeptide comprising expressing in a host cell
a DNA encoding a fusion protein as described above, purifying the fusion protein by elution through a chromatography
column adapted to retain the fusion protein, eluting through said chromatography column a ligand adapted to displace
the fusion protein and recovering the fusion protein. Production of substantially pure fusion protein is thus made possible.
Likewise, the fusion protein is readily cleaved to yield a polypeptide of the invention, again in substantially pure
form, as the second polypeptide may conveniently be removed using the same type of chromatography column.
[0042] The LH_N/A derived from dichain native toxin requires extended digestion with trypsin to remove the C-terminal
half of the heavy chain, the H_C domain. The loss of this domain effectively renders the toxin inactive in vivo by preventing
its interaction with host target cells. There is, however, a residual toxic activity which may indicate a contaminating,
trypsin insensitive, form of the whole type A neurotoxin.

[0043] In contrast, the recombinant preparations of the invention are the product of a discrete, defined gene coding
sequence and cannot be contaminated by full length toxin protein. Furthermore, the product as recovered from E. coli,
and from other recombinant expression hosts, is an inactive single chain peptide, or if expression hosts produce a
processed, active polypeptide it is not a toxin. Endopeptidase activity of LH423/A, as assessed by the current in vitro
peptide cleavage assay, is wholly dependent on activation of the recombinant molecule between residues 430 and
454 by trypsin. Other proteolytic enzymes that cleave between these two residues are generally also suitable for activation
of the recombinant molecule. Trypsin cleaves the peptide bond C-terminal to arginine or C-terminal to lysine
and is suitable as these residues are found in the 430-454 region and are exposed (see Fig. 12).
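The cleavage rule stated above (cutting C-terminal to arginine or lysine within a chosen window) can be illustrated with a short Python sketch. This is a simplified model under stated assumptions: it applies only the rule quoted in the text and ignores the known exceptions to trypsin specificity, and the function names and the toy sequence used to exercise it are hypothetical; the actual 430-454 sequence is not reproduced here.

```python
from typing import List, Optional


def trypsin_sites(sequence: str, first: int = 1,
                  last: Optional[int] = None) -> List[int]:
    """Return 1-based positions of Lys (K) or Arg (R) within [first, last].

    Under the simple rule used here, trypsin cuts the peptide bond
    immediately C-terminal to each of these residues.
    """
    if last is None:
        last = len(sequence)
    sites = []
    for position, residue in enumerate(sequence.upper(), start=1):
        if first <= position <= last and residue in "KR":
            sites.append(position)
    return sites


def digest(sequence: str, sites: List[int]) -> List[str]:
    """Split the sequence after each listed position, dropping empty pieces."""
    fragments, previous = [], 0
    for site in sorted(sites):
        fragments.append(sequence[previous:site])
        previous = site
    fragments.append(sequence[previous:])
    return [fragment for fragment in fragments if fragment]
```

For a real construct one would call `trypsin_sites(sequence, first=430, last=454)` on the full polypeptide sequence to list the candidate activation sites in the region named in the text.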
[0044] The recombinant polypeptides of the invention are potential therapeutic agents for targeting to cells expressing
the relevant substrate but which are not implicated in effecting botulism. An example might be where secretion of
neurotransmitter is inappropriate or undesirable or alternatively where a neuronal cell is hyperactive in terms of regulated
secretion of substances other than neurotransmitter. In such an example the function of the H_C domain of the
native toxin could be replaced by an alternative targeting sequence providing, for example, a cell receptor ligand and/or
translocation domain.

[0045] One application of the recombinant polypeptides of the invention will be as a reagent component for synthesis 
of therapeutic molecules, such as disclosed in WO-A-94/21300. The recombinant product will also find application as 
a non-toxic standard for the assessment and development of in vitro assays for detection of functional botulinum or 
tetanus neurotoxins either in foodstuffs or in environmental samples, for example as disclosed in EP-A-0763131. 
[0046] A further option is addition, to the C-terminal end of a polypeptide of the invention, of a peptide sequence
which allows specific chemical conjugation to targeting ligands of both protein and non-protein origin.
[0047] In yet a further embodiment an alternative targeting ligand is added to the N-terminus of polypeptides of the
invention. Recombinant LH_N derivatives have been designed that have specific protease cleavage sites engineered
at the C-terminus of the LC at the putative trypsin sensitive region and also at the extreme C-terminus of the complete
protein product. These sites will enhance the activational specificity of the recombinant product such that the dichain
species can only be activated by proteolytic cleavage of a more predictable nature than use of trypsin.
[0048] The LH_N enzymatically produced from native BoNT/A is an efficient immunogen and thus the recombinant
form with its total divorce from any full length neurotoxin represents a vaccine component. The recombinant product
may serve as a basal reagent for creating defined protein modifications in support of any of the above areas.
[0049] Recombinant constructs are assigned distinguishing names on the basis of their amino acid sequence length
and their Light Chain (L-chain, L) and Heavy Chain (H-chain, H) content as these relate to translated DNA sequences
in the public domain or specifically to SEQ ID NO: 2 and SEQ ID NO: 20. The 'LH' designation is followed by '/X' where
X denotes the corresponding clostridial toxin serotype or class, e.g. 'A' for botulinum neurotoxin type A or 'TeTx' for
tetanus toxin. Sequence variants from that of the native toxin polypeptide are given in parentheses in standard format,
namely the residue position number prefixed by the residue of the native sequence and suffixed by the residue of the
variant.

[0050] Subscript number prefixes indicate an amino-terminal (N-terminal) extension, or where negative a deletion,
to the translated sequence. Similarly, subscript number suffixes indicate a carboxy-terminal (C-terminal) extension or,
where negative numbers are used, a deletion. Specific sequence inserts such as protease cleavage sites are indicated
using abbreviations, e.g. Factor Xa is abbreviated to FXa. L-chain C-terminal suffixes and H-chain N-terminal prefixes
are separated by a '/' to indicate the predicted junction between the L and H-chains. Abbreviations for engineered
ligand sequences are prefixed or suffixed to the clostridial L-chain or H-chain corresponding to their position in the
translation product.
[0051] Following this nomenclature, 

LH423/A = SEQ ID NO: 2, containing the entire L-chain and 423 amino acids of the H-chain of botulinum
neurotoxin type A;

2LH423/A = a variant of this molecule, containing a two amino acid extension to the N-terminus of the L-chain;

2L/2H423/A = a further variant in which the molecule contains a two amino acid extension on the N-terminus
of both the L-chain and the H-chain;

2LFXa/2H423/A = a further variant containing a two amino acid extension to the N-terminus of the L-chain, and
a Factor Xa cleavage sequence at the C-terminus of the L-chain which, after cleavage of the
molecule with Factor Xa leaves a two amino acid N-terminal extension to the H-chain component; and

2LFXa/2H423/A-IGF-1 = a variant of this molecule which has a further C-terminal extension to the H-chain, in this
example the insulin-like growth factor 1 (IGF-1) sequence.
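The substitution notation used in this nomenclature (native residue, position number, variant residue, as in Q2E) lends itself to a small parser. The following Python sketch is illustrative only; the names `Substitution`, `parse_substitutions` and `apply_substitutions` are assumptions of the sketch, and the toy sequence used to exercise it is hypothetical rather than taken from the sequence listing.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    native: str    # residue in the native sequence
    position: int  # 1-based residue position
    variant: str   # residue in the variant sequence


_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(text: str) -> List[Substitution]:
    """Parse a list such as '(Q2E,N26K,A27Y)' into Substitution records."""
    result = []
    for item in text.strip("() ").split(","):
        match = _PATTERN.match(item.strip())
        if match is None:
            raise ValueError(f"not a substitution: {item!r}")
        native, position, variant = match.groups()
        result.append(Substitution(native, int(position), variant))
    return result


def apply_substitutions(sequence: str, substitutions: List[Substitution]) -> str:
    """Apply substitutions, checking that each native residue matches."""
    residues = list(sequence)
    for sub in substitutions:
        index = sub.position - 1
        if index < 0 or index >= len(residues) or residues[index] != sub.native:
            raise ValueError(f"native residue mismatch at {sub.position}")
        residues[index] = sub.variant
    return "".join(residues)
```

Checking the native residue before substituting is a deliberate choice: it catches off-by-one numbering errors, which are easy to make when N-terminal extensions shift residue positions.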

[0052] There now follows description of specific embodiments of the invention, illustrated by drawings in which:

Fig. 1 shows a schematic representation of the domain structure of botulinum neurotoxin type A (BoNT/A); 

Fig. 2 shows a schematic representation of assembly of the gene for an embodiment of the invention designated 

LH423/A; 

Fig. 3 is a graph comparing activity of native toxin, trypsin generated "native" LH_N/A and an embodiment of the
invention designated 2LH423/A(Q2E,N26K,A27Y) in an in vitro peptide cleavage assay;

Fig.4 is a comparison of the first 33 amino acids in published sequences of native toxin and embodiments of the 
invention; 

Fig. 5 shows the transition region of an embodiment of the invention designated L/4H423/A illustrating insertion of
four amino acids at the N-terminus of the H_N sequence; amino acids coded for by the Eco 47 III restriction endonuclease
cleavage site are marked and the H_N sequence then begins ALN...;

Fig. 6 shows the transition region of an embodiment of the invention designated LFXa/3H423/A illustrating insertion
of a Factor Xa cleavage site at the C-terminus of the L-chain, and three additional amino acids coded for at the
N-terminus of the H-sequence; the N-terminal amino acid of the cleavage-activated H_N will be cysteine;

Fig. 7 shows the C-terminal portion of the amino acid sequence of an embodiment of the invention designated
LFXa/3H423/A-IGF-1, a fusion protein; the IGF-1 sequence begins at position G882;

Fig. 8 shows the C-terminal portion of the amino acid sequence of an embodiment of the invention designated
LFXa/3H423/A-CtxA14, a fusion protein; the C-terminal CtxA sequence begins at position 0882;

Fig. 9 shows the C-terminal portion of the amino acid sequence of an embodiment of the invention designated
LFXa/3H423/A-ZZ, a fusion protein; the C-terminal ZZ sequence begins at position Aggo immediately after a genenase
recognition site (underlined);

Figs. 10 & 11 show schematic representations of manipulations of polypeptides of the invention; Fig. 10 shows
LH423/A with N-terminal addition of an affinity purification peptide (in this case GST) and C-terminal addition of an
Ig binding domain; protease cleavage sites R1, R2 and R3 enable selective enzymatic separation of domains; Fig. 11
shows specific examples of protease cleavage sites R1, R2 and R3 and a C-terminal fusion peptide sequence;

Fig. 12 shows the trypsin sensitive activation region of a polypeptide of the invention; 

Fig. 13 shows Western blot analysis of recombinant LH107/B expressed from E. coli; panel A was probed with anti-BoNT/B
antiserum; lane 1, molecular weight standards; lanes 2 & 3, native BoNT/B; lane 4, immunopurified LH107/B;
panel B was probed with anti-T7 peptide tag antiserum; lane 1, molecular weight standards; lanes 2 & 3, positive
control E. coli T7 expression; lane 4, immunopurified LH107/B.



[0053] The sequence listing that accompanies this application contains the following sequences:

SEQ ID NO:   SEQUENCE
1            DNA coding for LH423/A
2            LH423/A
3            DNA coding for 23LH423/A(Q2E,N26K,A27Y), of which an N-terminal portion is shown in Fig. 4
4            23LH423/A(Q2E,N26K,A27Y)
5            DNA coding for 2LH423/A(Q2E,N26K,A27Y), of which an N-terminal portion is shown in Fig. 4
6            2LH423/A(Q2E,N26K,A27Y)
7            DNA coding for native BoNT/A according to Binz et al
8            native BoNT/A according to Binz et al
9            DNA coding for L/4H423/A
10           L/4H423/A
11           DNA coding for LFXa/3H423/A
12           LFXa/3H423/A
13           DNA coding for LFXa/3H423/A-IGF-1
14           LFXa/3H423/A-IGF-1
15           DNA coding for LFXa/3H423/A-CtxA14
16           LFXa/3H423/A-CtxA14
17           DNA coding for LFXa/3H423/A-ZZ
18           LFXa/3H423/A-ZZ
19           DNA coding for LH728/B
20           LH728/B
21           DNA coding for LH417/B
22           LH417/B
23           DNA coding for LH107/B
24           LH107/B
25           DNA coding for LH423/A(Q2E,N26K,A27Y)
26           LH423/A(Q2E,N26K,A27Y)
27           DNA coding for LH417/B wherein the first 274 bases are modified to have an E. coli codon bias
28           DNA coding for LH417/B wherein bases 691-1641 of the native BoNT/B sequence have been
             replaced by a degenerate DNA coding for amino acid residues 231-547 of the native BoNT/B
             polypeptide
Example 1

[0054] A 2616 base pair, double stranded gene sequence (SEQ ID NO: 1) has been assembled from a combination
of synthetic, chromosomal and polymerase-chain-reaction generated DNA (Figure 2). The gene codes for a polypeptide
of 871 amino acid residues corresponding to the entire light-chain (LC, 448 amino acids) and 423 residues of the amino
terminus of the heavy-chain (HC) of botulinum neurotoxin type A. This recombinant product is designated the LH423/A
fragment (SEQ ID NO: 2).

Construction of the recombinant product 

[0055] The first 918 base pairs of the recombinant gene were synthesised by concatenation of short oligonucleotides
to generate a coding sequence with an E. coli codon bias. Both DNA strands in this region were completely synthesised
as short overlapping oligonucleotides which were phosphorylated, annealed and ligated to generate the full synthetic
region ending with a unique KpnI restriction site. The remainder of the LH423/A coding sequence was PCR amplified
from total chromosomal DNA from Clostridium botulinum and annealed to the synthetic portion of the gene.
[0056] The internal PCR amplified product sequences were then deleted and replaced with the native, fully sequenced,
regions from clones of C. botulinum chromosomal origin to generate the final gene construct. The final composition
is synthetic DNA (bases 1-913), polymerase amplified DNA (bases 914-1138 and 1976-2616) and the remainder
is of C. botulinum chromosomal origin (bases 1139-1975). The assembled gene was then fully sequenced and
cloned into a variety of E. coli plasmid vectors for expression analysis.
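The base ranges given for the final construct can be checked for consistency with the stated gene length of 2616 base pairs. The Python sketch below is illustrative; the segment labels and function names are assumptions of the sketch, and only the numeric ranges come from the text.

```python
from typing import Dict, List, Tuple

# (origin, first base, last base), 1-based inclusive, as quoted in the text.
SEGMENTS: List[Tuple[str, int, int]] = [
    ("synthetic", 1, 913),
    ("PCR-amplified", 914, 1138),
    ("chromosomal", 1139, 1975),
    ("PCR-amplified", 1976, 2616),
]


def tiles_exactly(segments: List[Tuple[str, int, int]], total: int) -> bool:
    """True if the segments cover bases 1..total with no gap or overlap."""
    expected_start = 1
    for _, start, end in sorted(segments, key=lambda seg: seg[1]):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total + 1


def bases_by_origin(segments: List[Tuple[str, int, int]]) -> Dict[str, int]:
    """Total number of bases contributed by each origin."""
    totals: Dict[str, int] = {}
    for origin, start, end in segments:
        totals[origin] = totals.get(origin, 0) + (end - start + 1)
    return totals
```

On these figures the four ranges tile bases 1-2616 exactly. As a separate check, 871 codons account for 2613 bases; the sketch does not assume what the remaining three bases encode.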

Expression of the recombinant gene and recovery of protein product 

[0057] The DNA is expressed in E. coli as a single nucleic acid transcript producing a soluble single chain polypeptide of 99,951 Daltons predicted molecular weight. The gene is currently expressed in E. coli as a fusion to the commercially available coding sequence of glutathione S-transferase (GST) of Schistosoma japonicum, but any of an extensive range of recombinant gene expression vectors such as pEZZ18, pTrc99, pFLAG or the pMAL series may be equally effective, as might expression in other prokaryotic or eukaryotic hosts such as the Gram positive bacilli, the yeast P. pastoris or in insect or mammalian cells under appropriate conditions.

[0058] Currently, E. coli harbouring the expression construct is grown in Luria-Bertani broth (L-broth pH 7.0, containing 10 g/l bacto-tryptone, 5 g/l bacto-yeast extract and 10 g/l sodium chloride) at 37°C until the cell density (biomass) has an optical absorbance of 0.4-0.6 at 600 nm and the cells are in mid-logarithmic growth phase. Expression of the gene is then induced by addition of isopropylthio-β-D-galactoside (IPTG) to a final concentration of 0.5 mM. Recombinant gene expression is allowed to proceed for 90 min at a reduced temperature of 25°C. The cells are then harvested by centrifugation, resuspended in a buffer solution containing 10 mM Na2HPO4, 0.5 M NaCl, 10 mM EGTA, 0.25% Tween, pH 7.0 and then frozen at -20°C. For extraction of the recombinant protein the cells are disrupted by sonication. The cell extract is then cleared of debris by centrifugation and the cleared supernatant fluid containing soluble recombinant fusion protein (GST-LH423/A) is stored at -20°C pending purification. A proportion of recombinant material is not released by the sonication procedure and this probably reflects insolubility or inclusion body formation. Currently we do not extract this material for analysis, but if desired this could be readily achieved using methods known to those skilled in the art.

[0059] The recombinant GST-LH423/A is purified by adsorption onto a commercially prepared affinity matrix of glutathione Sepharose and subsequent elution with reduced glutathione. The GST affinity purification marker is then removed by proteolytic cleavage and re-adsorption to glutathione Sepharose; recombinant LH423/A is recovered in the non-adsorbed material.

Construct variants 

[0060] A variant of the molecule, LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 26), has been produced in which three amino acid residues have been modified within the light chain of LH423/A, producing a polypeptide containing a light chain sequence different to that of the published amino acid sequence of the light chain of BoNT/A.
[0061] Two further variants of the gene sequence that have been expressed, and the corresponding products purified, are 23LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 4), which has a 23 amino acid N-terminal extension as compared to the predicted native L-chain of BoNT/A, and 2LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 6), which has a 2 amino acid N-terminal extension (Figure 4).

[0062] In yet another variant a gene has been produced which contains an Eco 47 III restriction site between nucleotides 1344 and 1345 of the gene sequence given in SEQ ID NO: 1. This modification provides a restriction site at the position in the gene representing the interface of the heavy and light chains in native neurotoxin, and provides the capability to make insertions at this point using standard restriction enzyme methodologies known to those skilled in the art. It will also be obvious to those skilled in the art that any one of a number of restriction sites could be so employed, and that the Eco 47 III insertion simply exemplifies this approach. Similarly, it would be obvious to one skilled in the art that insertion of a restriction site in the manner described could be performed on any gene of the invention. The gene described, when expressed, codes for a polypeptide, L/4H423/A (SEQ ID NO: 10), which contains an additional four amino acids between amino acids 448 and 449 of LH423/A at a position equivalent to the amino terminus of the heavy chain of native BoNT/A.
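
The effect of such an insertion on the reading frame can be sketched as follows. The flanking bases are taken from SEQ ID NO: 1 as printed (bases 1333-1344 and 1345-1356), and AGCGCT is the Eco 47 III recognition sequence. The 12-base insert itself is hypothetical: the text states only that four amino acids are added, so `HYPOTHETICAL_INSERT` is an assumption chosen for illustration and is not the sequence of SEQ ID NO: 9 or 10.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


FLANK_5 = "GGATACAATAAG"   # bases 1333-1344 of SEQ ID NO: 1 (ends light chain)
FLANK_3 = "GCATTAAATGAT"   # bases 1345-1356 of SEQ ID NO: 1 (starts heavy chain)
ECO47III_SITE = "AGCGCT"   # Eco 47 III recognition sequence

# Assumption: an illustrative 12-base, in-frame insert carrying the site.
HYPOTHETICAL_INSERT = "GGTAGCGCTGGT"

native = FLANK_5 + FLANK_3
modified = FLANK_5 + HYPOTHETICAL_INSERT + FLANK_3

print(translate(native))     # junction without the insert
print(translate(modified))   # junction with four extra residues
print(modified.count(ECO47III_SITE))
```

Because the insert length is a multiple of three and sits on a codon boundary, the downstream heavy chain codons stay in frame.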

[0063] A variant of the gene has been expressed, LFXa/3H423/A (SEQ ID NO: 12), in which a specific proteolytic cleavage site was incorporated at the carboxy-terminal end of the light chain domain, specifically after residue 448 of L/4H423/A. The cleavage site incorporated was for Factor Xa protease and was coded for by modification of SEQ ID NO: 1. It will be apparent to one skilled in the art that a cleavage site for another specified protease could be similarly incorporated, and that any gene sequence coding for the required cleavage site could be employed. Modification of the gene sequence in this manner to code for a defined protease site could be performed on any gene of the invention.
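
The consequence of an engineered protease site can be illustrated with a small sketch. It assumes the commonly cited Factor Xa recognition motif Ile-(Glu or Asp)-Gly-Arg with cleavage after the arginine. The toy polypeptide is hypothetical: it joins the residues flanking position 448 to an illustrative site and is not the sequence of SEQ ID NO: 12.

```python
import re

# Assumption: Factor Xa recognises I-(E/D)-G-R and cuts after the Arg.
FACTOR_XA_MOTIF = re.compile(r"I[ED]GR")


def cleave(protein):
    """Split a one-letter protein string after every Factor Xa motif."""
    fragments = []
    start = 0
    for match in FACTOR_XA_MOTIF.finditer(protein):
        fragments.append(protein[start:match.end()])
        start = match.end()
    fragments.append(protein[start:])
    return fragments


# Hypothetical junction: end of light chain + site + start of heavy chain.
toy_polypeptide = "GYNK" + "IEGR" + "ALND"

print(cleave(toy_polypeptide))
print(cleave("GYNKALND"))  # no site, so the chain stays intact
```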
[0064] Variants of LFXa/3H423/A have been constructed in which a third domain is present at the carboxy-terminal end of the polypeptide which incorporates a specific binding activity into the polypeptide.
[0065] Specific examples described are: 

(1) LFXa/3H423/A-IGF-1 (SEQ ID NO: 14), in which the carboxy-terminal domain has a sequence equivalent to that of insulin-like growth factor-1 (IGF-1) and is able to bind to the insulin-like growth factor receptor with high affinity;

(2) LFXa/3H423/A-CtxA14 (SEQ ID NO: 16), in which the carboxy-terminal domain has a sequence equivalent to that of the 14 amino acids from the carboxy-terminus of the A-subunit of cholera toxin (CtxA) and is thereby able to interact with the cholera toxin B-subunit pentamer; and

(3) LFXa/3H423/A-ZZ (SEQ ID NO: 18), in which the carboxy-terminal domain is a tandem repeating synthetic IgG binding domain. This variant also exemplifies another modification applicable to the current invention, namely the inclusion in the gene of a sequence coding for a protease cleavage site located between the end of the clostridial heavy chain sequence and the sequence coding for the binding ligand. Specifically in this example a sequence is inserted at nucleotides 2650 to 2666 coding for a genenase cleavage site. Expression of this gene produces a polypeptide which has the desired protease sensitivity at the interface between the domain providing HN function and the binding domain. Such a modification enables selective removal of the C-terminal binding domain by treatment of the polypeptide with the relevant protease.

[0066] It will be apparent that any one of a number of such binding domains could be incorporated into the polypeptide sequences of this invention and that the above examples merely exemplify the concept. Similarly, such binding domains can be incorporated into any of the polypeptide sequences that are the basis of this invention. Further, it should be noted that such binding domains could be incorporated at any appropriate location within the polypeptide molecules of the invention.

[0067] Further embodiments of the invention are thus illustrated by a DNA of the invention further comprising a 
desired restriction endonuclease site at a desired location and by a polypeptide of the invention further comprising a 
desired protease cleavage site at a desired location. 

[0068] The restriction endonuclease site may be introduced so as to facilitate further manipulation of the DNA in manufacture of an expression vector for expressing a polypeptide of the invention; it may be introduced as a consequence of a previous step in manufacture of the DNA; it may be introduced by way of modification by insertion, substitution or deletion of a known sequence. The consequence of modification of the DNA may be that the amino acid sequence is unchanged, or may be that the amino acid sequence is changed, for example resulting in introduction of a desired protease cleavage site; either way the polypeptide retains its first and second domains having the properties required by the invention.

[0069] Figure 10 is a diagrammatic representation of an expression product exemplifying features described in this example. Specifically, it illustrates a single polypeptide incorporating a domain equivalent to the light chain of botulinum neurotoxin type A and a domain equivalent to the HN domain of the heavy chain of botulinum neurotoxin type A, with an N-terminal extension providing an affinity purification domain, namely GST, and a C-terminal extension providing a ligand binding domain, namely an IgG binding domain. The domains of the polypeptide are spatially separated by specific protease cleavage sites enabling selective enzymatic separation of domains as exemplified in the Figure. This concept is more specifically depicted in Figure 11, where the various protease sensitivities are defined for the purpose of example.

Assay of product activity

[0070] The LC of botulinum neurotoxin type A exerts a zinc-dependent endopeptidase activity on the synaptic vesicle associated protein SNAP-25, which it cleaves in a specific manner at a single peptide bond. The 2LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 6) cleaves a synthetic SNAP-25 substrate in vitro under the same conditions as the native toxin (Figure 3). Thus, the modification of the polypeptide sequence of 2LH423/A (Q2E,N26K,A27Y) relative to the native sequence and within the minimal functional LC domains does not prevent the functional activity of the LC domains.
[0071] This activity is dependent on proteolytic modification of the recombinant GST-2LH423/A (Q2E,N26K,A27Y) to convert the single chain polypeptide product to a disulphide linked dichain species. This is currently done using the proteolytic enzyme trypsin. The recombinant product (100-600 µg/ml) is incubated at 37°C for 10-50 minutes with trypsin (10 µg/ml) in a solution containing 140 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, pH 7.3. The reaction is terminated by addition of a 100-fold molar excess of trypsin inhibitor. The activation by trypsin generates a disulphide linked dichain species as determined by polyacrylamide gel electrophoresis and immunoblotting analysis using polyclonal anti-botulinum neurotoxin type A antiserum.

[0072] 2LH423/A is more stable in the presence of trypsin and more active in the in vitro peptide cleavage assay than is 23LH423/A. Both variants, however, are fully functional in the in vitro peptide cleavage assay. This demonstrates that the recombinant molecule will tolerate N-terminal amino acid extensions and this may be expanded to other chemical or organic moieties as would be obvious to those skilled in the art.

Example 2 

[0073] As a further exemplification of this invention a number of gene sequences have been assembled coding for polypeptides corresponding to the entire light-chain and varying numbers of residues from the amino terminal end of the heavy chain of botulinum neurotoxin type B. In this exemplification of the disclosure the gene sequences assembled were obtained from a combination of chromosomal and polymerase-chain-reaction generated DNA, and therefore have the nucleotide sequence of the equivalent regions of the natural genes, thus exemplifying the principle that the substance of this disclosure can be based upon natural as well as synthetic gene sequences.

[0074] The gene sequences relating to this example were all assembled and expressed using methodologies as detailed in Sambrook J, Fritsch E F & Maniatis T (1989) Molecular Cloning: A Laboratory Manual (2nd Edition), Ford N, Nolan C, Ferguson M & Ockler M (eds), Cold Spring Harbor Laboratory Press, New York, and known to those skilled in the art.

[0075] A gene has been assembled coding for a polypeptide of 1171 amino acids corresponding to the entire light-chain (443 amino acids) and 728 residues from the amino terminus of the heavy chain of neurotoxin type B. Expression of this gene produces a polypeptide, LH728/B (SEQ ID NO: 20), which lacks the specific neuronal binding activity of full length BoNT/B.

[0076] A gene has also been assembled coding for a variant polypeptide, LH417/B (SEQ ID NO: 22), which possesses an amino acid sequence at its carboxy terminus equivalent by amino acid homology to that at the carboxy-terminus of the heavy chain fragment in native LHN/A.

[0077] A gene has also been assembled coding for a variant polypeptide, LH107/B (SEQ ID NO: 24), which expresses at its carboxy-terminus a short sequence from the amino terminus of the heavy chain of BoNT/B sufficient to maintain solubility of the expressed polypeptide.

Construct Variants 

[0078] A variant of the coding sequence for the first 274 bases of the gene shown in SEQ ID NO: 21 has been produced which, whilst being a non-native nucleotide sequence, still codes for the native polypeptide.
[0079] Two double stranded gene sequences, one of 268 base pairs and one of 951 base pairs, have been created using an overlapping primer PCR strategy. The nucleotide bias of these sequences was designed to have an E. coli codon usage bias.

[0080] For the first sequence, six oligonucleotides representing the first (5') 268 nucleotides of the native sequence for botulinum toxin type B were synthesised. For the second sequence, 23 oligonucleotides representing internal sequence nucleotides 691-1641 of the native sequence for botulinum toxin type B were synthesised. The oligonucleotides ranged from 57-73 nucleotides in length. Overlapping regions, 17-20 nucleotides, were designed to give melting temperatures in the range 52-56°C. In addition, terminal restriction endonuclease sites of the synthetic products were constructed to facilitate insertion of these products into the exact corresponding region of the native sequence. The 268 bp 5' synthetic sequence has been incorporated into the gene shown in SEQ ID NO: 21 in place of the original first 268 bases (and is shown in SEQ ID NO: 27).

Similarly the sequence could be inserted into other genes of the examples.
[0081] Another variant sequence equivalent to nucleotides 691 to 1641 of SEQ ID NO: 21, and employing non-native codon usage whilst coding for a native polypeptide sequence, has been constructed using the internal synthetic sequence. This sequence (SEQ ID NO: 28) can be incorporated, alone or in combination with other variant sequences, in place of the equivalent coding sequence in any of the genes of the example.
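
The oligonucleotide figures given for the overlapping primer PCR strategy can be sanity-checked with a short sketch. It asks whether a given number of oligonucleotides, within the stated length and overlap ranges, can span each target, and estimates an overlap melting temperature with the Wallace rule (2°C per A/T, 4°C per G/C). The Wallace rule is an assumption here, since the text does not say how melting temperatures were calculated, and the example overlap sequence is hypothetical. Extra bases for the terminal restriction sites are ignored.

```python
def span_covered(n_oligos, oligo_len, overlap):
    """Length spanned by n equal oligos that each overlap the next by `overlap`."""
    return n_oligos * oligo_len - (n_oligos - 1) * overlap


def feasible(target_len, n_oligos, len_range, overlap_range):
    """True if the target length lies between the shortest and longest span."""
    shortest = span_covered(n_oligos, len_range[0], overlap_range[1])
    longest = span_covered(n_oligos, len_range[1], overlap_range[0])
    return target_len in range(shortest, longest + 1)


def wallace_tm(seq):
    """Rough melting temperature (deg C) of a short oligo by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


LENGTHS = (57, 73)    # oligonucleotide length range from paragraph [0080]
OVERLAPS = (17, 20)   # overlap length range from paragraph [0080]

print(feasible(268, 6, LENGTHS, OVERLAPS))    # 5' synthetic sequence
print(feasible(951, 23, LENGTHS, OVERLAPS))   # internal synthetic sequence

# Hypothetical 18-nucleotide overlap, for illustration only.
print(wallace_tm("ATGCAGTTCGTGAACAAG"))
```

Both targets fall inside the achievable span, so the stated oligonucleotide counts are consistent with the stated lengths and overlaps.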

Example 3 

[0082] An exemplification of the utility of this invention is as a non-toxic and effective immunogen. The non-toxic nature of the recombinant, single chain material was demonstrated by intraperitoneal administration in mice of GST-2LH423/A. The polypeptide was prepared and purified as described above. The amount of immunoreactive material in the final preparation was determined by enzyme linked immunosorbent assay (ELISA) using a monoclonal antibody (BA11) reactive against a conformation dependent epitope on the native LHN/A. The recombinant material was serially diluted in phosphate buffered saline (PBS; NaCl 8 g/l, KCl 0.2 g/l, Na2HPO4 1.15 g/l, KH2PO4 0.2 g/l, pH 7.4) and 0.6 ml volumes injected into 3 groups of 4 mice such that each group of mice received 10, 5 and 1 micrograms of material respectively. Mice were observed for 4 days and no deaths were seen.

[0083] For immunisation, 20 µg of GST-2LH423/A in a 1.0 ml volume of water-in-oil emulsion (1:1 vol:vol) using Freund's complete (primary injections only) or Freund's incomplete adjuvant was administered into guinea pigs via two sub-cutaneous dorsal injections. Three injections at 10 day intervals were given (day 1, day 10 and day 20) and antiserum collected on day 30. The antisera were shown by ELISA to be immunoreactive against native botulinum neurotoxin type A and to its derivative LHN/A. Antisera which were botulinum neurotoxin reactive at a dilution of 1:2000 were used for evaluation of neutralising efficacy in mice. For neutralisation assays 0.1 ml of antiserum was diluted into 2.5 ml of gelatine phosphate buffer (GPB; Na2HPO4 anhydrous 10 g/l, gelatin (Difco) 2 g/l, pH 6.5-6.6) containing a dilution range from 0.5 µg (5 × 10^-7 g) to 5 picograms (5 × 10^-12 g). Aliquots of 0.5 ml were injected into mice intraperitoneally and deaths recorded over a 4 day period. The results are shown in Table 1 and Table 2. It can clearly be seen that 0.5 ml of 1:40 diluted anti-GST-2LH423/A antiserum can protect mice against intraperitoneal challenge with botulinum neurotoxin in the range 5 pg - 50 ng (1 - 10,000 mouse LD50; 1 mouse LD50 = 5 pg).
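
The conversion between toxin mass and mouse LD50 units quoted above is simple arithmetic, sketched here with the stated equivalence of 1 mouse LD50 = 5 pg. The function and constant names are illustrative only.

```python
PG_PER_NG = 1000      # picograms per nanogram
LD50_IN_PG = 5        # 1 mouse LD50 = 5 pg, as stated in paragraph [0083]


def mouse_ld50_units(dose_pg):
    """Convert a toxin dose in picograms to mouse LD50 units."""
    return dose_pg / LD50_IN_PG


lowest = mouse_ld50_units(5)                # 5 pg challenge
highest = mouse_ld50_units(50 * PG_PER_NG)  # 50 ng challenge

print(lowest, highest)
```

The protected range of 5 pg to 50 ng therefore corresponds to 1 to 10,000 mouse LD50, matching the figures in the text.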



TABLE 1.

Neutralisation of botulinum neurotoxin in mice by guinea pig anti-GST-2LH423/A antiserum.

TABLE 2.

Neutralisation of botulinum neurotoxin in mice by non-immune guinea pig antiserum.

Botulinum Toxin/mouse

Survivors On Day    0.5 µg    0.005 µg    0.0005 µg    0.5 ng    0.005 ng    5 pg    Control (no toxin)

4

Example 4

Expression of recombinant LH107/B in E. coli.

[0084] As an exemplification of the expression of a nucleic acid coding for an LHN of a clostridial neurotoxin of a serotype other than botulinum neurotoxin type A, the nucleic acid sequence (SEQ ID NO: 23) coding for the polypeptide LH107/B (SEQ ID NO: 24) was inserted into the commercially available plasmid pET28a (Novagen, Madison, WI, USA). The nucleic acid was expressed in E. coli BL21 (DE3) (New England BioLabs, Beverly, MA, USA) as a fusion protein with an N-terminal T7 fusion peptide, under IPTG induction at 1 mM for 90 minutes at 37°C. Cultures were harvested and recombinant protein extracted as described previously for LH423/A.

[0085] Recombinant protein was recovered and purified from bacterial paste lysates by immunoaffinity adsorption to an immobilised anti-T7 peptide monoclonal antibody using a T7 tag purification kit (New England BioLabs, Beverly, MA, USA). Purified recombinant protein was analysed by gradient (4-20%) denaturing SDS-polyacrylamide gel electrophoresis (Novex, San Diego, CA, USA) and western blotting using polyclonal anti-botulinum neurotoxin type B antiserum or anti-T7 antiserum. Western blotting reagents were from Novex; immunostained proteins were visualised using the Enhanced Chemi-Luminescence system (ECL) from Amersham. The expression of an anti-T7 antibody and anti-botulinum neurotoxin type B antiserum reactive recombinant product is demonstrated in Figure 13.
[0086] The recombinant product was soluble and retained that part of the light chain responsible for endopeptidase activity.

[0087] The invention thus provides recombinant polypeptides useful inter alia as immunogens, enzyme standards 
and components for synthesis of molecules as described in WO-A-94/21300. 

SEQUENCE LISTING 

[0088] 

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: MICROBIOLOGICAL RESEARCH AUTHORITY
(B) STREET: Centre For Applied Microbiology And Research, Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(A) NAME: THE SPEYWOOD LABORATORY LIMITED
(B) STREET: 14 Kensington Square
(C) CITY: London
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): W8 5HH

(A) NAME: FOSTER; Keith Alan
(B) STREET: Centre For Applied Microbiology And Research, Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(A) NAME: QUINN; Conrad Padraig
(B) STREET: Centre For Applied Microbiology And Research, Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(A) NAME: SHONE; Clifford Charles
(B) STREET: Centre For Applied Microbiology And Research, Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(ii) TITLE OF INVENTION: Recombinant Toxin Fragments

(iii) NUMBER OF SEQUENCES: 28

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2616 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2616

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

... GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GCC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG GAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG TTT TTT 1392
Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA GAA GAA 1440
Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

ATT ACA TCT GAT ACT AAT ATA GAA CCA GCA GAA GAA AAT ATT AGT TTA 1488
Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT GAA CCT 1536
Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC CAA TTA 1584
Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 1632
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA TTT GAA 1680
Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA GCA TTA 1728
His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT GTA AAG 1776
Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

... GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 1824
Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT ACT ACG 1872
Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA CCT GCT 1920
Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT GCT TTA 1968
Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG ATT GCA 2016
Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG AAT AAG 2064
Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA AAT GAA 2112
Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA GCA AAG 2160
Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA GCT TTA 2208
Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG TAT AAT 2256
Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT GAT GAT 2304
Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT AAT ATA 2352
Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT TCT ATG 2400
Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT CTT AAA 2448
Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA ATT GGT 2496
Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT ACA GAT 2544
Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT 2592
Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

ACA TTT ACT GAA TAT ATT AAG TAA 2616
Thr Phe Thr Glu Tyr Ile Lys *
865 870



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 872 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
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Met Gin Phe Val Asn Lys Gin Phe Asn Tyr Lys Asp Pro Val Asn Oly 
15 10 IS 

Val Asp 11^ Ala Tyr lie Lys lie Pro Asn Ala Gly Gin Met Gin Pro 
20 25 30 

Val Lys Ala Phe Lys lie His Asn Lys lie Trp val lie Pro Glu Arg 
35 40 45 

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu 
50 S5 60 

Ala Lys Gin Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr 
S5 70 75 

Asp Asn Glu Lya Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu 
85 90 95 

Arg lie Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser lie Val 
100 105 xiO 

Arg Gly lie Pro Phe Trp Oly Gly Ser Thr lie Asp Thr Glu Leu Lys 
115 120 

Val He Asp Thr Asn Cys He Asn Val He Gin Pro Asp Gly Ser Tvr 
130 135 140 ^ 

Arg Ser Glu Glu Leu Asn Leu Val He He Oly Pro Ser Ala Asp He 
145 150 155 160 

He Gin Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr 
165 170 175 
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Axg Aan Qiy Tyr Qly Ser Thr Gin Tyr He Arg Phe Ser Pro Asp Phe 
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 



20S 



Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 

2X0 215 



220 



Leu He His Ala Gly His Arg Leu Tyr Gly He Ala He Asn Pro Asn 

230 235 240 

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

Glu Val ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys 



270 



Phe He Asp Ser l^eu Gin Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 285 

Lys Phe Lys Asp He Ala Ser Thr Leu Aan Lys Ala Lys Ser He Val 

290 295 

Gly Thr Thr Ala Ser Leu Gin Tyr Met Lys Asn Val Phe Lys Glu Lys 

310 220 

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu 

330 33S 

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu He Tyr Thr Glu Aen 
340 345 350 ^ 

Asn Phe Val Lys Phe Phe Lys Val Leu Aan Arg Lys Thr Tyr Leu Asn 
355 3S0 ~ 



365 



Phe Asp Lys Ala Val Phe Lys He Asn He Val Pro Lys Val Asn Tyx 
370 375 390 

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

Thr Phe Thr Glu Tyr Ile Lys *
865 870



(2) INFORMATION FOR SEQ ID NO: 3: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2685 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION:1..2685 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GGA TCC CCA GGA ATT CAT ATG ACG TCG ACG CGT CTG CAG AAG CTT CTA 48
Gly Ser Pro Gly Ile His Met Thr Ser Thr Arg Leu Gln Lys Leu Leu
1 5 10 15

GAA TTC GAG CTC CCG GGT ACC ATG GAG TTC GTG AAC AAG CAG TTC AAC 96
Glu Phe Glu Leu Pro Gly Thr Met Glu Phe Val Asn Lys Gln Phe Asn
20 25 30

TAT AAG GAC CCT GTA AAC GGT GTT GAC ATT GCC TAC ATC AAA ATT CCA 144
Tyr Lys Asp Pro Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro
35 40 45

AAG TAC GGC CAG ATG CAG CCG GTG AAG GCT TTC AAG ATT CAT AAC AAA 192
Lys Tyr Gly Gln Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys
50 55 60

ATC TGG GTT ATT CCG GAA CGC GAT ACA TTT ACG AAC CCG GAA GAA GGA 240
Ile Trp Val Ile Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly
65 70 75 80

GAC TTG AAC CCG CCG CCG GAA GCA AAG CAG GTG CCA GTT TCA TAC TAC 288
Asp Leu Asn Pro Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr
85 90 95

GAT TCA ACC TAT CTG AGC ACA GAC AAC GAG AAG GAT AAC TAC CTG AAG 336
Asp Ser Thr Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys
100 105 110

GGA GTG ACC AAA TTA TTC GAG CGT ATT TAT TCC ACT GAC CTG GGC CGT 384
Gly Val Thr Lys Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg
115 120 125

ATG CTG CTG ACC TCA ATC GTC CGC GGA ATC CCA TTT TGG GGT GGC AGT 432
Met Leu Leu Thr Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser
130 135 140

ACC ATT GAC ACG GAG TTG AAG GTT ATT GAC ACT AAC TGC ATT AAC GTG 480
Thr Ile Asp Thr Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val
145 150 155 160

ATC CAA CCA GAC GGT AGC TAC AGA TCT GAA GAA CTT AAC CTC GTA ATC 528
Ile Gln Pro Asp Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile
165 170 175

ATC GGG CCC TCC GCG GAC ATT ATC CAG TTT GAG TGC AAG AGC TTT GGC 576
Ile Gly Pro Ser Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly
180 185 190

CAC GAA GTG TTG AAC CTG ACG CGT AAC GGT TAC GGC TCT ACT CAG TAC 624
His Glu Val Leu Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr
195 200 205



ATT CGT TTC AGC CCA GAC TTC ACG TTC GGT TTC GAG GAG AGC CTG GAG 672
Ile Arg Phe Ser Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu
210 215 220

GTT GAT ACC AAC CCG CTG TTG GGT GCA GGC AAG TTC GCA ACT GAT CCA 720
Val Asp Thr Asn Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro
225 230 235 240

GCG GTG ACC CTG GCA CAC GAG CTG ATC CAC GCC GGT CAT CGT CTG TAT 768
Ala Val Thr Leu Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr
245 250 255

GGC ATT GCG ATT AAC CCG AAC CGC GTG TTC AAG GTT AAC ACC AAC GCC 816
Gly Ile Ala Ile Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala
260 265 270

TAC TAC GAG ATG AGT GGT TTA GAA GTA AGC TTC GAG GAA CTG CGC ACG 864
Tyr Tyr Glu Met Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr
275 280 285






TTC GGT GGC CAT GAT GCG AAG TTT ATC GAC AGC TTG CAG GAG AAC GAG 912
Phe Gly Gly His Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu
290 295 300

TTC CGT CTG TAC TAC TAC AAC AAG TTT AAA GAT ATT GCA AGT ACA CTG 960
Phe Arg Leu Tyr Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu
305 310 315 320

AAC AAG GCT AAG TCC ATT GTG GGT ACC ACT GCT TCA TTA CAG TAT ATG 1008
Asn Lys Ala Lys Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met
325 330 335

AAA AAT GTT TTT AAA GAG AAA TAT CTC CTA TCT GAA GAT ACA TCT GGA 1056
Lys Asn Val Phe Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly
340 345 350

AAA TTT TCG GTA GAT AAA TTA AAA TTT GAT AAG TTA TAC AAA ATG TTA 1104
Lys Phe Ser Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu
355 360 365

ACA GAG ATT TAC ACA GAG GAT AAT TTT GTT AAG TTT TTT AAA GTA CTT 1152
Thr Glu Ile Tyr Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu
370 375 380

AAC AGA AAA ACA TAT TTG AAT TTT GAT AAA GCC GTA TTT AAG ATA AAT 1200
Asn Arg Lys Thr Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn
385 390 395 400

ATA GTA CCT AAG GTA AAT TAC ACA ATA TAT GAT GGA TTT AAT TTA AGA 1248
Ile Val Pro Lys Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg
405 410 415

AAT ACA AAT TTA GCA GCA AAC TTT AAT GGT CAA AAT ACA GAA ATT AAT 1296
Asn Thr Asn Leu Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn
420 425 430

AAT ATG AAT TTT ACT AAA CTA AAA AAT TTT ACT GGA TTG TTT GAA TTT 1344
Asn Met Asn Phe Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe
435 440 445

TAT AAG TTG CTA TGT GTA AGA GGG ATA ATA ACT TCT AAA ACT AAA TCA 1392
Tyr Lys Leu Leu Cys Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser
450 455 460

TTA GAT AAA GGA TAC AAT AAG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1440
Leu Asp Lys Gly Tyr Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val
465 470 475 480



AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1488
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
485 490 495

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1536
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
500 505 510

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1584
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
515 520 525

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1632
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
530 535 540

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1680
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
545 550 555 560

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1728
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
565 570 575

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1776
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
580 585 590






AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1824
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
595 600 605

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1872
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
610 615 620

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1920
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
625 630 635 640

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1968
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
645 650 655

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 2016
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
660 665 670

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2064
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
675 680 685



GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2112
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
690 695 700

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2160
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
705 710 715 720

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2208
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
725 730 735

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2256
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
740 745 750




AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2304
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
755 760 765

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2352
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
770 775 780

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2400
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
785 790 795 800

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2448
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
805 810 815

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2496
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
820 825 830

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2544
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
835 840 845

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2592
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
850 855 860

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2640
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
865 870 875 880

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2685
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
885 890 895



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 895 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



Gly Ser Pro Gly Ile His Met Thr Ser Thr Arg Leu Gln Lys Leu Leu
1 5 10 15

Glu Phe Glu Leu Pro Gly Thr Met Glu Phe Val Asn Lys Gln Phe Asn
20 25 30

Tyr Lys Asp Pro Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro
35 40 45

Lys Tyr Gly Gln Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys
50 55 60

Ile Trp Val Ile Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly
65 70 75 80

Asp Leu Asn Pro Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr
85 90 95

Asp Ser Thr Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys
100 105 110
Gly Val Thr Lys Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg
115 120 125

Met Leu Leu Thr Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser
130 135 140

Thr Ile Asp Thr Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val
145 150 155 160

Ile Gln Pro Asp Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile
165 170 175

Ile Gly Pro Ser Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly
180 185 190

His Glu Val Leu Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr
195 200 205

Ile Arg Phe Ser Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu
210 215 220

Val Asp Thr Asn Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro
225 230 235 240

Ala Val Thr Leu Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr
245 250 255

Gly Ile Ala Ile Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala
260 265 270

Tyr Tyr Glu Met Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr
275 280 285

Phe Gly Gly His Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu
290 295 300

Phe Arg Leu Tyr Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu
305 310 315 320

Asn Lys Ala Lys Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met
325 330 335

Lys Asn Val Phe Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly
340 345 350

Lys Phe Ser Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu
355 360 365

Thr Glu Ile Tyr Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu
370 375 380

Asn Arg Lys Thr Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn
385 390 395 400

Ile Val Pro Lys Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg
405 410 415

Asn Thr Asn Leu Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn
420 425 430

Asn Met Asn Phe Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe
435 440 445

Tyr Lys Leu Leu Cys Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser
450 455 460
Leu Asp Lys Gly Tyr Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val
465 470 475 480

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
485 490 495

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
500 505 510

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
515 520 525

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
530 535 540

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
545 550 555 560

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
565 570 575

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
580 585 590

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
595 600 605

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
610 615 620

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
625 630 635 640

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
645 650 655

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
660 665 670

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
675 680 685

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
690 695 700

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
705 710 715 720

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
725 730 735

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
740 745 750

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
755 760 765

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
770 775 780

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
785 790 795 800

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
805 810 815
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
820 825 830

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
835 840 845

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
850 855 860

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
865 870 875 880

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
885 890 895

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION:1..2622

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GGA TCC ATG GAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA 48
Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val
1 5 10 15

AAC GGT GTT GAC ATT GCC TAC ATC AAA ATT CCA AAG TAC GGC CAG ATG 96
Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Lys Tyr Gly Gln Met
20 25 30

CAG CCG GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG 144
Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro
35 40 45

GAA CGC GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG 192
Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro
50 55 60

CCG GAA GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG 240
Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu
65 70 75 80

AGC ACA GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA 288
Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu
85 90 95

TTC GAG CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA 336
Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser
100 105 110

ATC GTC CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG 384
Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu
115 120 125

TTG AAG GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT 432
Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly
130 135 140



AGC TAC AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG 480
Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala
145 150 155 160

GAC ATT ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC 528
Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn
165 170 175

CTG ACG CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA 576
Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro
180 185 190

GAC TTC ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG 624
Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro
195 200 205

CTG TTG GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA 672
Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala
210 215 220

CAC GAG CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC 720
His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn
225 230 235 240

CCG AAC CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT 768
Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser
245 250 255






GGT TTA GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT 816
Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp
260 265 270

GCG AAG TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC 864
Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr
275 280 285

TAC AAC AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC 912
Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser
290 295 300

ATT GTG GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA 960
Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys
305 310 315 320

GAG AAA TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT 1008
Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp
325 330 335

AAA TTA AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA 1056
Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr
340 345 350

GAG GAT AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT 1104
Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr
355 360 365

TTG AAT TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA 1152
Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val
370 375 380

AAT TAC ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA 1200
Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala
385 390 395 400

GCA AAC TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT 1248
Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr
405 410 415
AAA CTA AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT 1296
Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys
420 425 430

GTA AGA GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC 1344
Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr
435 440 445

AAT AAG GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG 1392
Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu
450 455 460

TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA 1440
Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly
465 470 475 480

GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA AAT ATT 1488
Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile
485 490 495

AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT 1536
Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn
500 505 510

GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC 1584
Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly
515 520 525

CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG 1632
Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys
530 535 540

TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA 1680
Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu
545 550 555 560

TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA 1728
Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu
565 570 575






GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT 1776
Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr
580 585 590

GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG 1824
Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp
595 600 605

GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT 1872
Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser
610 615 620

ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA 1920
Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly
625 630 635 640

CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT 1968
Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly
645 650 655






GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG 2016
Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu
660 665 670

ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG 2064
Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala
675 680 685
AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA 2112
Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg
690 695 700

AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA 2160
Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu
705 710 715 720

GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA 2208
Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu
725 730 735

GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG 2256
Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln
740 745 750

TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT 2304
Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile
755 760 765

GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT 2352
Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile
770 775 780

AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT 2400
Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn
785 790 795 800






TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT 2448
Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser
805 810 815

CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA 2496
Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu
820 825 830

ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT 2544
Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser
835 840 845

ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA 2592
Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu
850 855 860

TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2622
Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val
1 5 10 15

Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Lys Tyr Gly Gln Met
20 25 30

Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro
35 40 45
Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro
50 55 60

Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu
65 70 75 80

Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu
85 90 95

Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser
100 105 110

Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu
115 120 125

Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly
130 135 140

Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala
145 150 155 160

Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn
165 170 175

Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro
180 185 190

Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro
195 200 205

Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala
210 215 220

His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn
225 230 235 240

Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser
245 250 255

Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp
260 265 270

Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr
275 280 285

Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser
290 295 300

Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys
305 310 315 320

Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp
325 330 335

Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr
340 345 350

Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr
355 360 365

Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val
370 375 380

Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala
385 390 395 400
Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr
405 410 415

Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys
420 425 430

Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr
435 440 445

Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu
450 455 460

Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly
465 470 475 480

Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile
485 490 495

Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn
500 505 510

Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly
515 520 525

Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys
530 535 540

Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu
545 550 555 560

Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu
565 570 575

Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr
580 585 590

Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp
595 600 605

Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser
610 615 620

Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly
625 630 635 640

Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly
645 650 655

Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu
660 665 670

Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala
675 680 685

Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg
690 695 700

Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu
705 710 715 720

Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu
725 730 735

Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln
740 745 750
Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile
755 760 765

Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile
770 775 780

Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn
785 790 795 800

Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser
805 810 815

Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu
820 825 830

Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser
835 840 845

Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu
850 855 860

Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870

(2) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2613 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION:1..2613

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ATG CCA TTT GTT AAT AAA CAA TTT AAT TAT AAA GAT CCT GTA AAT rr-r 48
Met Pro Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAT ATT GCT TAT ATA AAA ATT CCA AAT GCA GGA CAA ATG CAA 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTA AAA GCT TTT AAA ATT CAT AAT AAA ATA TGG GTT ATT CCA GAA AGA 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACA AAT CCT GAA GAA GGA GAT TTA AAT CCA CCA CCA GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAA CAA GTT CCA GTT TCA TAT TAT GAT TCA ACA TAT TTA AGT ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAT AAT GAA AAA GAT AAT TAT TTA AAG GGA GTT ACA AAA TTA TTT GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95



AGA ATT TAT TCA ACT GAT CTT GGA AGA ATG TTG TTA ACA TCA ATA GTA 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

AGG GGA ATA CCA TTT TGG GGT GGA AGT ACA ATA GAT ACA GAA TTA AAA 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAT ACT AAT TGT ATT AAT GTG ATA CAA CCA GAT GGT AGT TAT 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCA GAA GAA CTT AAT CTA GTA ATA ATA GGA CCC TCA GCT GAT ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATA CAG TTT GAA TGT AAA AGC TTT GGA CAT GAA GTT TTG AAT CTT ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGA AAT GGT TAT GGC TCT ACT CAA TAC ATT AGA TTT AGC CCA GAT TTT 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACA TTT GGT TTT GAG GAG TCA CTT GAA GTT GAT ACA AAT CCT CTT TTA 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205






GGT GCA GGC AAA TTT GCT ACA GAT CCA GCA GTA ACA TTA GCA CAT GAA 672 
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

CTT ATA CAT GCT GGA CAT AGA TTA TAT GGA ATA GCA ATT AAT CCA AAT 720 
Leu He His Ala Gly His Arg Leu Tyr Gly Xlm Ala He Asn Pro Asn 
225 230 235 240 

AGG GTT TTT AAA GTA AAT ACT AAT GCC TAT TAT GAA ATG AGT GGG TTA 768 
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

GAA GTA AGC TTT GAG GAA CTT AGA ACA TTT GGG GGA CAT GAT GCA AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270 

TTT ATA GAT AGT TTA CAG GAA AAC GAA TTT CGT CTA TAT TAT TAT AAT 864 

Phe He Asp Ser Leu Gin Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 285 


AAG TTT AAA GAT ATA GCA AGT ACA CTT AAT AAA GCT AAA TCA ATA GTA 912 

Lys Phe Lys Asp He Ala Ser Thr Leu Asn Lys Ala Lys Ser He Val 
290 295 300 






GGT ACT ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960 
Gly Thr Thr Ala Ser Leu Gin Tyr Met Lys Asn Val Phe Lys Glu Lys 
305 310 315 320 

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008 
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu 
325 330 335 

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu He Tyr Thr Glu Asp 
340 345 350 

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104 
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn 
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380 


ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200 
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248 
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu

405 410 415 

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430 

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344 
Gly He He Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 
435 440 445 






GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG TTT TTT 1392 
Ala Leu Asn Asp Leu Cys He Lys Val Asn Asn Trp Asp Leu Phe Phe 
450 455 460 

AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA GAA GAA 1440 
Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 
465 470 475 480 

ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA AAT ATT AGT TTA 1488
Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495 



GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT GAA CCT 1536 
Asp Leu He Gin Gin Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro 
500 505 510

GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC CAA TTA 1584 
Glu Asn He Ser He Glu Asn Leu Ser Ser Asp He He Gly Gin Leu 
515 520 525



GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 1632 
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540 

TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA TTT GAA 1680
Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560 

CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA GCA TTA 1728
His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT GTA AAG 1776 
Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 
580 585 590 

AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 1824 
Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu 
595 600 605 

CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT ACT ACG 1872 
Gin Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr 
610 615 620 

GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA CCT GCT 1920 
Asp Lys He Ala Asp He Thr He He He Pro Tyr He Gly Pro Ala 
625 630 635 640
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TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT GCT TTA 1968 
Leu Asn lie Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 
645 650 655 

ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG ATT GCA 2016
Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670 

ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG AAT AAG 2064 
He Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr He Ala Asn Lys 
675 680 685 

GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA AAT GAA 2112 
Val Leu Thr Val Gin Thr He Asp Asn Ala Leu Ser Lys Arg Asn Glu 
690 695 700 

AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA GCA AAG 2160

Lys Trp Asp Glu Val Tyr Lys Tyr He Val Thr Asn Trp Leu Ala Lys 
705 710 715 720

GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA GCT TTA 2208 
Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG TAT AAT 2256 
Glu Asn Gin Ala Glu Ala Thr Lys Ala He He Asn Tyr Gin Tyr Asn 
740 745 750 






CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT GAT GAT 2304 

Gin Tyr Thr Glu Glu Glu Lys Asn Asn He Asn Phe Asn He Asp Asp 
755 760 765 

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT AAT ATA 2352 
Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780 

AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT TCT ATG 2400 
Asn Lys Phe Leu Asn Gin Cys Ser Val Ser Tyr Leu Met Asn Ser Met 
785 790 795 800 

ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT CTT AAA 2448
Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys

805 810 815 

GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA ATT GGT 2496
Asp Ala Leu Leu Lys Tyr He Tyr Asp Asn Arg Gly Thr Leu He Gly 
820 825 830 






CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT ACA GAT 2544
Gin Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 
835 840 845 



ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT 2592
Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

ACA TTT ACT GAA TAT ATT AAG 2613 
Thr Phe Thr Glu Tyr He Lys 
865 870 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 871 amino acids 
(B) TYPE: amino acid

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



Met Pro Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

Thr Phe Thr Glu Tyr Ile Lys
865 870



(2) INFORMATION FOR SEQ ID NO: 9: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2628 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2628

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTG 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255



GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

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AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056 
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn 
355 360 365 

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152 
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200 
Thr lie Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn 
385 390 395 400 



TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gin Asn Thr Glu He Asn Asn Met Asn Phe Thr Lys Leu 

405 410 415 

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296 
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344 
Gly He He Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 
435 440 445 

AGC GCT GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG 1392 
Ser Ala Asp Gly Ala Leu Asn Asp Leu Cys He Lys Val Asn Asn Trp 
450 455 460 

GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT 1440
Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn
465 470 475 480 

AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA 1488 
Lys Gly Glu Glu He Thr Ser Asp Thr Asn He Glu Ala Ala Glu Glu 
485 490 495 

AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT 1536 
Asn He Ser Leu Asp Leu He Gin Gin Tyr Tyr Leu Thr Phe Asn Phe 

500 505 510 

GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT 1584
Asp Asn Glu Pro Glu Asn He Ser He Glu Asn Leu Ser Ser Asp He 
515 520 525 

ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA 1632

He Gly Gin Leu Glu Leu Met Pro Asn He Glu Arg Phe Pro Asn Gly 

530 535 540

AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT 1680
Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala 
545 550 555 560 






CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT 1728
Gin Glu Phe Glu His Gly Lys Ser Arg He Ala Leu Thr Asn Ser Val 
565 570 575 

AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA 1776 
Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser
580 585 590 

GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA 1824
Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu
595 600 605
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GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA 1872 
Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu
610 615 620

GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT 1920
Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr
625 630 635 640 

ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT 1968 
He Gly Pro Ala Leu Asn He Gly Asn Met Leu Tyr Lys Asp Asp Phe 
645 650 655

GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA 2016 
Val Gly Ala Leu He Phe Ser Gly Ala Val He Leu Leu Glu Phe He 
660 665 670

CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT 2064

Pro Glu He Ala He Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr 
675 680 685 

ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT 2112 
Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser
690 695 700

AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT 2160 
Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr He Val Thr Asn 
705 710 715 720 

TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG 2208 
Trp Leu Ala Lys Val Asn Thr Gin He Asp Leu He Arg Lys Lys Met 
725 730 735 

AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC 2256 
Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn
740 745 750

TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT 2304 
Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe
755 760 765 

AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT 2352 
Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala
770 775 780

ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA 2400
Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu
785 790 795 800

ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT 2448

Met Asn Ser Met He Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp 
805 810 815 

GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA 2496
Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly
820 825 830 



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ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA 2544 
Thr Leu He Gly Gin Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr 
835 840 845

CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA 2592 
Leu Ser Thr Asp He Pro Phe Gin Leu Ser Lys Tyr Val Asp Asn Gin 
850 855 860 

AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2628 
Arg Leu Leu Ser Thr Phe Thr Glu Tyr He Lys * 
865 870 875 
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(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 876 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro

20 25 30 

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu 
50 55 60 

Ala Lys Gin Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr 
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu 
85 90 95 

Arg He Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser He Val 
100 105 110 

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125 

Val He Asp Thr Asn Cys He Asn Val He Gin Pro Asp Gly Ser Tyr 
130 135 140 

Arg Ser Glu Glu Leu Asn Leu Val He He Gly Pro Ser Ala Asp He 
145 150 155 160 

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190 

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 
195 200 205 

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

Leu He His Ala Gly His Arg Leu Tyr Gly He Ala He Asn Pro Asn 
225 230 235 240 

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys 
260 265 270 

Phe He Asp Ser Leu Gin Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 285 

Lys Phe Lys Asp He Ala Ser Thr Leu Asn Lys Ala Lys Ser He Val 
290 295 300 
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Gly Thr Thr Ala Ser Leu Gin Tyr Met Lys Asn Val Phe Lys Glu Lys 
305 310 315 320 

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335 

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu He Tyr Thr Glu Asp 
340 345 350 

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365 

Phe Asp Lys Ala Val Phe Lys He Asn He Val Pro Lys Val Asn Tyr 
370 375 380 

Thr He Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn 
385 390 395 400

Phe Asn Gly Gin Asn Thr Glu He Asn Asn Met Asn Phe Thr Lys Leu 
405 410 415 

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

Gly He He Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 
435 440 445

Ser Ala Asp Gly Ala Leu Asn Asp Leu Cys He Lys Val Asn Asn Trp 
450 455 460 

Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn 
465 470 475 480 

Lys Gly Glu Glu He Thr Ser Asp Thr Asn He Glu Ala Ala Glu Glu 
485 490 495 

Asn He Ser Leu Asp Leu He Gin Gin Tyr Tyr Leu Thr Phe Asn Phe 
500 505 510 

Asp Asn Glu Pro Glu Asn He Ser He Glu Asn Leu Ser Ser Asp He 
515 520 525

Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly
530 535 540 

Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala 
545 550 555 560 

Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val
565 570 575 

Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser
580 585 590

Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu 
595 600 605 

Gly Trp Val Glu Gin Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu 
610 615 620 

Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr

625 630 635 640 

He Gly Pro Ala Leu Asn He Gly Asn Met Leu Tyr Lys Asp Asp Phe 
645 650 655 






EP0 939 818B1 



Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile
660 665 670

Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr
675 680 685

He Ala Asn Lys Val Leu Thr Val Gin Thr He Asp Asn Ala Leu Ser 
690 695 700 

Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr He Val Thr Asn 
705 710 715 720 

Trp Leu Ala Lys Val Asn Thr Gin He Asp Leu He Arg Lys Lys Met 
725 730 735 

Lys Glu Ala Leu Glu Asn Gin Ala Glu Ala Thr Lys Ala He He Asn 
740 745 750 

Tyr Gin Tyr Asn Gin Tyr Thr Glu Glu Glu Lys Asn Asn He Asn Phe 

755 760 765

Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala
770 775 780 

Met He Asn He Asn Lys Phe Leu Asn Gin Cys Ser Val Ser Tyr Leu 
785 790 795 800

Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp
805 810 815 

Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly
820 825 830 

Thr Leu He Gly Gin Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr 
835 840 845 

Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln
850 855 860

Arg Leu Leu Ser Thr Phe Thr Glu Tyr He Lys * 
865 870 875 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2637 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2637

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175 

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205 

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720 
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240 

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768 
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816 
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270 

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912 

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1824
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1872
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1920 
Thr Ser Glu Val Ser Thr Thr Asp Lys He Ala Asp He Thr He He 
625 630 635 640 

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 1968
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655 

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064
Glu Phe He Pro Glu He Ala He Pro Val Leu Gly Thr Phe Ala Leu 
675 680 685 

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700 

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160 
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr He 
705 710 715 720 



GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256 
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750 

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304 
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765 

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352 
He Asn Phe Asn He Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser He 
770 775 780 

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400 
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448 
Ser Tyr Leu Met Asn ser Met He Pro Tyr Gly Val Lys Arg Leu Glu 
80S BIO 815 

50 - - 

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496 
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr He Tyr Asd 
820 825 830 

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544 
Asn Arg Gly Thr Leu He Gly Gin Val Asp Arg Leu Lys Asp Lys Val 
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AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAr r-ra 
Asn Asn Thr Leu Ser Thr Asp lie Pro Phe Gin Leu Ser Lys Tyr Val 
osKJ 855 860 

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 
Asp Asn Gin Arg Leu Leu Ser Thr Phe Thr Glu Tyr He Lye ♦ 
865 870 87S 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 879 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605






EP 0 939 818 B1 



Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870 875

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE: 

(A) NAME/KEY: CDS
(B) LOCATION: 1..2862
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGO CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG T1X3 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2592
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TCT AGG 2640
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

CCT GGA CCG GAG ACG CTC TGC GGG GCT GAG CTG GTG GAT GCT CTT CAG 2688
Pro Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln
885 890 895

TTC GTG TGT GGA GAC AGG GGC TTT TAT TTC AAC AAG CCC ACA GGG TAT 2736
Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr
900 905 910

GGC TCC AGC AGT CGG AGG GCG CCT CAG ACA GGT ATC GTG GAT GAG TGC 2784
Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys
915 920 925

TGC TTC CGG AGC TGT GAT CTA AGG AGG CTG GAG ATG TAT TGC GCA CCC 2832
Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro
930 935 940

CTC AAG CCT GCC AAG TCA GCT GAA GCT TAG 2862
Leu Lys Pro Ala Lys Ser Ala Glu Ala *
945 950



(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 954 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

Pro Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln
885 890 895

Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr
900 905 910

Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys
915 920 925

Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro
930 935 940

Leu Lys Pro Ala Lys Ser Ala Glu Ala *
945 950



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2724 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2724

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

624 



€72 



S^U^im ?S SI 2S m ^ SJ? S 2S ?f I" 

100 ^ VZf Thr Ser lie Val 

•^"^ 110 

s ?i| ^ K 15 s| - s SI II ss 

SI S SI SS ?S S SS f K 5 25 2S ?f 

i Sfi S .3 S Sf S S ^ S jcc ocj „c j„ 

160 

?S SI ^ SSI SI IK 2» "° 

175 

CGT AAC GGT TAC GGC TCT ACT rafl Tur^ . 

Arg Asn Gly Tyr Gly Ser ?hJ ST*^ ^''^ ^« 

180 ?jj ^-^^ ^ **he Ser Pro Asp Phe 

ACG TTC GGT TTC GAG GAG ACr rrr r*»/^ ^m«, 

™. «„ as SI S SI S!I SI Jg ^ SI S 

205 

s 0^ SI s ?s s; SI s? ^ s IS SI SI 

SSSISIiSSII^S^^SIsSsSISSI 
12 Kl SI SI 511 SI SI JJI S SI iS? „^ S 

2S5 

IKSSSISIIKSSIigillls-SISJISSI 
SIJfl.^5IIS|glSI|^SISIS2l^5-SI 

285 

^sjsisjns^jsssij^ifi-jsji:- - 

300 

;1 SI s s s; s SI SI SI ;ii s i?i SI »• 

TAT etc CTA TCT GAA GAT ACA TCT GGA AAA TTT Trr r-ra 
Tyr Leu Leu ser Glu Asp Thr Ser ^ 5^ 11^ V.l ^ 

330 335 

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA rar ath- . 
LX. Phe A.P Txr Lys M.? ^ S $S 

350 

^^n Se ^ JSI ^ 5It ™ AAT 

355 ^ 3ffi i^y* Thr Tyr Leu Asn 



3 84 



432 



480 



526 



576 



768 



816 



1104 



55 355 3^0 3„ 
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10 






s s Si s s; SI IS SI IS s 

570 

Jil 12 SI in Si Si K 12 KI gl ^ S JS S 



590 



1152 



1248 
1296 
1344 
1392 



S S S SI K S S2 S S iil 

ACA ATA TAT GAT GGA TTT AAT rra ar-K 

».p s iS, Hi iS j;; ?2 jji s ..co 

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATr ^r-rvt. 
Phe «an Gly ai„ oi« Jj^ ^ 

410 4j^5 

AAA AAT TTT ACT GGA TTG TTT GAA TTT tut aar. t... - 

Asn Phe Thr ai. X-u Pha ^ |I ^ f^* ^ -A AG. 

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT aaa ru^^ 
Gly lie XX. Thr Ser ... xhr 2^ ^ 

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT naT n^n 

lie Glu Gly Ara Cva Aan aw ai * t * -^^^ AAA GTT 

^^u ^xy Arg cys Aep Gly Ala Leu Asn Asp Leu Cys lie Lya Val 

51 iil S 2S IS IS S iS S IS Si 5 ISI S J£ 21 

^''S 480 

GAT CTA AAT AAA GGA GAA GAA ATT ACA TPT ra-r nr-r nn^ « 
A.P .eu A.« l.ya Giy Oi. Glu ^Te S ?S 

490 4^5 

SSiSigISIi2Si2I21SiSiS;?iI?^'IIiJ£ 

505 5io 

S iil SI 21 £1 Si S2 =^ ill K IS S «^ iil S 12 

52 i S ?Ii S Si S SI? £1 iil S Si is S= 

540 

S Jil - S - - |g j5 j.^ 5JC cjj 



1584 



1632 



1728 



1776 



TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT A«2 na« rsr.* r.,^ 
Phe ser Ser Asp Tyr Val Lys Ly. Val iJ^n SJI «2 ^ SI 

600 g05 

jnsiniisssssisisisj^-sissjsi 

f iS ?Ji S S ?S ?S SI 5i S S 5 Si ?£ 5i S 
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10 






s ^ s s s s 1^ - - ™ j« «. 

645 '^sn Met Leu Tyr Lys 

GAT GAT TTT GTA GGT GCT TTA ATA ttt t^j. r,^. 

«P «.p Ph. V.J „. 2; «; SI J2 S SI SI ?n - 

670 



GAA TTT ATA CCA GAG ATT GCA ATA CCT rva ^« 

am ... ,„ „. s S 52 S 25 S E SI 

GTA TCA TAT ATT GCS AAT AAG GTT CTA ACC rrT- r... 
val S« lyr He AI. Asn Ly» Val 2^ ^ S! G^S S 



1968 



2064 



2352 
2400 



GCT TTA ACT AAA AGA AAT GAA AAA Tt3G rar nt^r. 

M. .-u s.. .y. ^ !^ S? 55 ™j «; „j 

720 

GTA ACA AAT TGG TTA GCA AUr r-t-r bkf*. 

T., „„ „^ 2S SI SI S §S S 2? S S is 

735 

AAA AAA ATG AAA GAA GCT TTA GAA AAT ran 

ty3 Lye Met Lys Glu Ma Leu Glu ^ f GCA ACA AAG GCT 
740 ^''^ <3Iu Ala Thr Lys Ala 

750 

ATA ATA AAC TAT CAO TAT AAT CAA i-at Kr^ 

rx. „. »^ ^ ^ 21 ^ ^ S i« - IK ^ Jil j!S 

765 

ATT AAT TTT AAT ATT GAT GAT TTA AGT Tna aab ^ . 

Aan Phe Aan lie Asp Asp i^u ser l^r ®^ Ata 

770 *^ 77| i-ya Leu Asn Glu Ser IZe 

AAT AAA GCT ATG ATT AAT ATA a^t m~ 

A.n I.y. Al. Met xU jt^^ i^n ^ Jhl ^ ^ 

78S 79Q i^eu Asn Gin Cys Ser Val 

'^^ 800 

TCA TAT TTA ATG AAT TCT ATO ATC rr-r 

ser Tyr Leu Nec Asn J« «et ?[e p" gj §?J 5" ^GG TTA GAA 

805 iif Val Lys Arg Leu Glu . 

815 

^ SI S^J SI ^ "A A- TAT ATA TAX OAT 

820 ^ MS Tyr He lyr Asp 

i^IiS^J S SSS 21^ r 

83S ^ 840 ^ VaX 

AAT AAT ACA CTT AOT ACA GAT ATA rrr> tnil.. 

Asn Aan Thr teu sL Se pS ^ ^ ^AC GTA 

850 ^ Ola Leu ser Lys Tyr V*l 

860 

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT pah »~ 

ASP Asn ain Arg Leu Leu Ser «>r SI fS S J^S 

^■^5 gao 

CCT CAA TCT AAA GTT AAA AGA CAA ATA TTT 

Pro Gin ser Lys Val Lys Arg Gin 1 1 e SI ^ I^^ GAT 

885 w xxe me Ser Oly Tyr Qln Ser Asp 

ATT GAT ACA CAT AAT AGA ATT AAG GAT GAA TTn tv.« 
He Asp Thr His Asn Arg He Lys ^ ^ 

55 505 



2160 
2208 
2256 
2304 



2496 
2S44 
2592 



2688 
2724 
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(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 908 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 






Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

Pro Gln Ser Lys Val Lys Arg Gln Ile Phe Ser Gly Tyr Gln Ser Asp
885 890 895

Ile Asp Thr His Asn Arg Ile Lys Asp Glu Leu
900 905

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3042 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3042

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:



ATG CAG TTC GTG AAC AAO CAG TTC AAC TAT AAn n^r- ^ 
Met Ola P».. val A.n X.y. OX„ Phe fJJ OCT 






SI m iis - s c„ «c ^ 

30 

GTG AAG GCT TTC AAG ATT CAT Aar aa^ 

V.I ... ... ... S SI 2S ?s s S S s 

?s K ^? go° s; s: If; ss r f = 

50 " ^J-u Giy Asp Leu Asn Pro Pro Pro Glu 

GCA AAG GAG GTG CCA GTT TCA TAr nr^ 

Ala Lys Gin Val Pro vJl ler Sr I" ™^ 
65 70 Tyr Leu Ser Thr 

95 

CGT ATT TAT TCC ACT GAG CTG GGC CGT arr r^r* 

Arg lie Tyr Ser Thr Asp Gly So ^^S^ 

100 ^ jjf Thr Ser Il« val 

S IS S ?S IfJ „^ S5 - S 25 o^? K ^ 



ST i]I S5 is iS S iSS ?I? ?K If; 2J „^ S£ 



TTT ATC GAC AGC TTG CAG GAG AAC nar» TTr« -^^m 

«» ... «p s., ... =i„ JIS SI s s ^ ^ iS; 



96 



X44 



192 



240 



288 



336 



364 



432 



480 



^ 12 Si s; s ^ SI ?s jis s - - - SI S5 s 

ATC CAG TTT GAG TGC AAG AGC ttt rsr*r^ „^ . 

Xle axn PK. CXu SI Si ^ ^ 

170 

S iJI i?? IS JS SI ^ S - - ^ SI SI 
5SSI^SISISIJ||gailII2J-«|-S2S 

205 

15 IS IS J^l SI Si S IS SI S S Si SI SI 

'^'^^ 220 



S76 



624 



CTG ATC CAC GCC GGT CAT CGT CTG tat tm^ . 

..u XXe H.S ^. ox. Hi. S 22 §S ^ ^ it^S 

235 240 



SSSSI^SiSIJIIJilSI^SfSIS.'lJSIfJSi 

250 255 
GAA GTA AGC TTC GAG GAA CTG CGC ACQ TTC GGT nrr nr^n^ 
Glu va. ser Phe OXu Olu I,eu Arg ^ SI ^ ^11 



664 



265 
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15 






300 

i ?S SI L^^ S - - - - s - - 

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA ttt rrr n'VT^ r^r.^ 

ry. L.U ..u s„ ».p .J s; E IS S3 2j s 



;s ?g ?s IS ^ s s ^ ^ jji - 



40 "-■ ^'O 47S 



480 



OAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT a-ra 
A.P Leu A,n Ly. Gly Glu Glu lie Thr Ser ?S iin ll^ 

4S0 

s; s; s a |2 is s; ^ s 
SI s SI s; s =3 i^s Jii ?s s; s 22 



525 



912 



960 



Tyr Leu Leu Ser Glu Asp Thr Ser Glv Lys pA^ S« Val Asp ^ 

AAA TTT GAT AAG TTA TAG AAA ATG TTA ACA GAG ATT Tir a/-^ . 
Lx. Phe ASP uys Le« Tyr Ly. Met .eS ?II §?S 

350 



1056 



1104 



5S SIT s s ;ij Si i- - ^ - s - 

K ^ Sf S SI K s Sf S 5 Sf Jii I?? 

^ ti; 2j IS s £1 s IS !si ^ c s; s: Si iss 
s ui gfj s; sji ?e s j:i s isi si s ^ s 

410 

AAA AAT TTT ACT GGA TTO TTT GAA TTT TAT AAr T^n r-m »»v^ 
Lys A,« Phe Thr Gly Leu Phe ^ ™^ S° ^ ^J? ^ 

425 430 « 



1200 



1248 



1344 



» s; S5 SI 21 s 51 s ss "« 

460 

AAT AAT TGG GAC TTG TTT TTT AGT rrr Trz. n*« 

As« As„ ASP .eu Phe S ^ IS jJ^S 



1488 



1536 



1584 



1632 



AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA r^n ^ 
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe

535 540 

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr

555 






"'^ 575 



AAT TCT GTT AAC GAA GCA TEA TTA AAT err ^r"r r>r,^ 

^ ..r v.. jjj ^. ^ 2; «T jrr w jjj 



50 79S 

TCA TAT TTA AIG AAT TCT ATG ATC CCT TAT QQT erf 
S« Tyr Leu «et Asn S« Het II. Pro ?^ ^ OAA 

610 — — _ 



815 



1728 



1776 



1824 



TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT * 
Phe Ser Ser Asp Tyr Val Lys Lys vll ^ <^ GCT 

S9S ^ gJo Ala Thr Glu Ala Ala 

60S 

ATG TTT TTA GGC TGG GTA GAA CAA TTA cta -rn^r 

«« pj. L,„ «v v.. £ s; s; s: s 2j 

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT rnn ^^n^ 
Thr ser GXu v.l ser Thr Thr A^J ^™ JTA att 

640 

ATT CCA TAT ATA GGA CCT GCT TTA Aat atm • 

lie pro Tyr XI. Oly Pro A? S £^ 

670 

s: ^ jfi =s s IS s iK Si s 

''IS 720 
35 GTA ACA AAT TGG TTA GCA AAG GTT aat nr-n ««« _ 

vaa O^r Asn Trp AXa "A ATA ^ „o« 

730 735 
AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA nnik r^,.* 

Ly. .y. „et .ys CXU AXa L.u Sit iJS SI 

?i: ?2 ^ J^' 2: J2 SS Si IX 2; 2J 

7ao 

785 790 ifSe ^^"^ Ser Val 



2064 



2112 



2400 



2448 



825 33^



AAT AGA GGA ACT TTA ATT GOT CAA GTA cat ho* . 

^ Gj, T.. X.. "y S 2J s; ^ S5 SI 
SI 21 ?S S i^l S 2J ?IJ S ?" p "* «t» 

850 all ^-^-^ "° ™e Gin Leu Ser Lya Tyr VaX 

OAT AAT CAA AGA TTA TTA TCT ACA r^r^ 

».| »«. «» ^ 5S ?s s s ^ jjj ss 

CTG AAT TCC CCG G(3T GCA GCT CM TAT rer n»« ^.^ 

X.U As„ ser Pro Gly «a Ala SfJ ?S S?s' Sp^ §S SJ 

895 



^ 2; Jij s ^ s 2j £ ^ iss ^ 

GAA CAA CAA AAC GCG TTC TAT GAG atv* -m.. 

Glu Gin Gin A.„ Ali Si ?1^ SI JI"" 

965 ^ "^^ Asn Leu Asn 

GAA GAA CAA CGA AAC GCC TTC nnA 

Glu Glu Gin Arg Asn iia ?Je lie ^ 2Iu ^IJ? f'''' '^^ 
980 ote Asp Asp Pro Ser 

a; SI ii; s s ?s £s 5; j;: 2; SI 

1005 

GCG CCG AAA GTA GAC TAG 
Ala Pro Lys Val Asp *
1010 *^ 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1014 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



2544 



2592 



2640 



2688 



2736 



925 



2832



2880



2928 



2976 



3024 



3042 



Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
Val Asp Ile Ala Tyr Ile Lys Ile Pro

val Lys Ala Ph. X.y. He His Asn Ly. He Trp Val He Z Glu Arg 

45 



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Asp Thr Phe Thr Asn Pro Glu Glu » , 

50 ss " Pro Pro Pro cm 

€0 

Ala Lys Gin Val Pro Val th^^ ^ 

70 5« Thr Tyr tea Ser Thr 

ASP Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lv. r 

8S ' ~g "^^-^ Thr Lys Leu Phe Glu 

Arg He Tyr Ser Thr Asp Leu Gly Ara m^^ r 

100 " ^^^^ ««t Leu Leu Thr Ser He Vai 

^ oir u. o,, ^ ^ 

n; c,. „. ^ ^ 
^„ „. ^ 
ri. cx„ cv. s„ P.. „ 

175 

Arg Aan Oly Tyr Gly Ser Thr GXn OVr 11- Ar» ou „ 

Thr Phe Qly Phe Qlu Glu Ser JLeu oiu Val Asb Th^ « 

1»S 200 ^ ""^ JC«u lieu 

«; 01, L,. «„ „. ^ „. ^ 
US n. ai. «. oi, »^ ^ „^ ^ 

V.X ,^ ,^r^^^ 

255 

Olu v.. s.r jj. ^ ^ „ „^ 

270 

p.. XI. J.P s„ -~ ry, 

-y. *.p «. ^ ^ ^. ;;;; 

IJv „. ,„ T,. H.. x^. „^ 

ry. L.U x«u s„ oxu «p t,„ s.r ox, j„ ^ ^ ^ ^. ^ 

335 

Lys Phe ASP X.U ryr .ys Mac .eu n,r Glu Zle Tyr Thr Glu Asp 

350 

A.. Phe V.1 .y. p., .ys val .eu As« Ar^ .ys Thr Tyr .eu Asn 

Ph« ASP .ys Ala val Phe .,s Xle Asn Xle Val p.^ Val Asn Tyr 

Thr Xle Tyr Asp Gly Phe Asn Leu Arg Asn T^ Z .eu Ala Ala Asn 

400
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala



495 



Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr

505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser

525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr

555

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe

590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile



^35 



640 



Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys

650

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu

85

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile

720
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg

730

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala

745 750 



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ne lie Asn Tyr Gin TVr Asn Gin Tyr Thr Glu Glu Glu .,3 A,„ 

7$S 

lie Asn Phe Asn He Asp Asp Leu Ser Sp^t- r 

770 ^ 77c Asn Glu Ser lie 

780 

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val

800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu

815 

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp

830 

Asn Arg Gly Thr Leu He Gly Gin Val Asn t 

835 840 ^ ^ Val 

64 S 



Asn Asn Thr Leu Ser Thr Asp He Pro Ph- r 

850 oef Ser Lys Tyr Val 

660 

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Gly

gag 

Leu Asn Ser Pro Gly Ala Ala His Tyr Ala Gln His Asp Glu Ala Val

895

Asp Asn Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu

910

His Leu Pro Asn Leu Asn Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser

Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys

»J5 940 ^ 

Lys Leu Asn Asp Ala Gln Ala Pro Lys Val Asp Asn Lys Phe Asn Lys

Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu His Leu Pro Asn Leu Asn

Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser



990 

Gin Ser Ala Asn Leu Leu Ala Glu Ala rv. t.«- . 

995 1000 ^ Gin 

Ala Pro Lys Val Asp *
1010 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3509 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: CDS 



(B) LOCATION: 1..3509
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:






ATG CCA GTT ACA ATA AAT AAT TTT AAT TaT ni^m r^i<^ ^ 
«.t Pr» v.> Th, n, A„ «„ E JJS ^ S JJJ 

10 ^5 

HI S S S? S - - ?S S5 ilS " 

AGA TAT ACT TTT GGA TAT AAA rrr ran r^R-r 
Arg T.r Thr Phe aXy r/r ^ 

s is s ^ a; - - - «c ™ 
s 2j £^ ^ - jii SI K js s jn it: s: 

90 95 

K ^ IS s sj ^ s s s s 



SJ 15 f S 5S S 2J - - - - s §K 

X25 



192



240 



288 



336



384 



432 



s ^ ?s SI js - - SI - - ™ f« «r 

s; s; ^ 5; S5 s isi s s 
s gi gf? j2 ST Hi SI s; SI „^ s - - - - 

170 

?KS£JiI2ISISiSJJSS;c=^SI^§^t2SI?- 

ATQ AAG TTT TGC CCA GAA TAT era nnn i^* rruii,.. — 

».t c. p„ s; ^ ^ ™ Si SI SI SI s; Si 
JS^ §g Si s js[ ;s s s 2j - 

220 



480 



528 



576



624 



SI S Jli S £? SI Si - s - - ™ gj ooj 2j 

50 2*0 

S?ni^sit2isisi-ss^sigi^s;si - 

250 255 

. s^5^?-,^?^s^sI^Ii;^saisi--^s^s - 
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?s s s; lii s is^^is^i^r^ir. - 

300 

AAG GTT TTA GTT TGC ATA TCA GAT r<-r * 

V.X v.. ay. ni m til iS= K iSJ JJJ "J »J 

320 

AAA AAT AAA TTT AAA GAT AAA TAT iw* «~ 

Ly. ^» .V. «. Lj. ^ ^ S SI fil S Sf J 

335 

"^^^ 350 

S ^ SI 12 ^ IS ™ - - ™ 

360 



"5 400 



TCT GAT AAA GAT ATG GAA AAA GAA TAT Ann 
Ser Asp Lys Asp Met ^ ^ gJJ «J CAO AAT AAA GCT ATA 

405 A?? -'^^ ^^ys AI* He 

415 

iii s; ^ s; ^ sk s; s si ?g 

430 

ST S S S S 21 SI ^ «' - - 12 

SI s 12 IK is; ?s s; SI S s ss si 

«75 

^'J2s;si^ssissi^sjiis-?sis 

495 

TTA ATA AGT AAA ATA GAA TTA CCA AGT caa aat ^. 
X.e ser xie OXu LeS ^ |^ g?^ $S ^ 

SlSISISjr^SIiSS^^SlJ^SiJSSSJU^ 

525 

AAA ATT TTT ACA OAT GAA AAT ACC ATC TTT nn^K ^y.^ ^ 

Uys Ue Thr Asp «u ^„ ^ ^= J^^ 



960 



1008 



1056 



1152 



^jJ2ssisi^sfgij£™--s?sisis: 



1248 



1296



1344 



1392 



1440 



1536 



1584 



1632



«« 670 



BBS 



1728 



S SI S S is S S S S S IS 

GAT GCA TTA TTA TTT TCT AAC AAA GTT tat t^-in « 

ASP Ma .e« .eu PHe Ser ^1 SI S S2 

570 

TGG GTG AAA CAG ATA GTA AAT CSAT TTT GTA Arr ran ^n-r 
Trp vax ^y. oin II, V,l As„ SI S IJi) ^fl 

AAT ACT ATG GAT AAA ATT GCA GAT ATA TCT rra «~ 
Asn n« Met A,p .ya X.e Ala A^J J2 ^^^l^o'^ 

§S IS 51? ^ ^ ^ ° CCA AAT rrx OAA 

SI S S„= J n S „^ S - S 2; S; - rr. „j ccj 

IK 21 s f 2 SI s ST §s SI s s: s; SI K 



1824 



1872 



1968 



2016 



2064 



2112 



2160 



2208



s; IS J2 J- ji? - - ™ }u s s: IK s 

700 

S J2 iS SI iSI ^ s; - 1^' s - - s - - 

715 720 

SI iji s; sj a; g; s aj si ?s s k ?s 

40 '^^ 730 735 

is^'KiJs^ssij^^j^iiJisijnissfiisi 

745 750 

SlillSItKiJIISSJSIJ^ISS^^K^SJSI?!! - 
SI ^I J2 UI iii SI ?I| iSi ^ IS 12 S JI? 

^ is? s sj; s SI St IS ^ s; 2; 25 s si 

795 300 

JSS^J^iJISSSJJISISSISIiSIi^Sl^' 

810 55^5 






TTG ATT GGA AGT GCA GAA TAT GAA AAA TrA r^^-^ , 

Leu lie Gly Ser Ala 0?^; Tyl gIS SJ^ Ser vll ^ ™^ ™^ 

820 fl?f " Tyr Leu 

^-^^ 830 

AAA ACC ATT ATG CCG TTT GAT CTT TCA ATA TAT Ann i^^-r 
.ys Xhr XI. „ec ,ro P.e Asp Leu s3 ^ 

»40 845 

CTA ATA GAA ATG TTT AAT AAA TAT AAT AGn nan nn*^ ^« . 

L.U ne oa« Me. P.e ^„ .y. ™ ATX 

ATC TTA AAT TTA AGA TAT AAG GAT AAT AAT tta a^rm 

ne .e« A.„ Arg Tyr .y, A^ JIJ t^l J^^ 

S Si J?? IK S2 i^^ §S Sf §S 21 its S| ^ 

Si SI 5^ Si ?s s s iss ^' s 

* 910 

CAA AAT CAG A*r ATC ATA TTT AAT AOT QTO TTC CIT GAT TTT i^fn «^ 
Gin Asn Gl„ A.n He la. Ph. A.« Ser Val ^ 21 ^ 

925 

5S s: ti2 s K 5S ^ ^ s| m il- 

g| S SI 21 ^ iS S !2I ?S S - - - IS 

s SI s §s j;; jss s ?s 5S s K 

J?&5 970 



50 ^^^^ ^^^^ 3.0-55 



2496 



2544 



2640 



2688 



2736 



2784 



2832



2880 



2928 



GAT ATA AAT GGA AAA ACC AAA TCG GTA TTT TTT GAA TAT Aao a^i. ^r^^ 
ASP lie Asn Gly Lya Thr Lya Ser Val l^l l^e xl^ j^S 

r 985 
GAA GAT ATA TCA GAG TAT ATA AAT AGA TGG TTT TTT eSTA Ar-r a^-f 

01« A,p la. ser Glu Tyr He Asn Arg ^ ^ ^ ^ ^ 
40 1000 lOQS 

AOT AAT TTQ AAT AAC OCT AAA ATT TAT ATT AAT GGT AAG CTA GAA TCA 
Asn Asn^Leu Asa A.n Al. Lys^Ile Tyr He Asn aiy^Lyt ^ I« 

AAT ACA GAT ATT AAA GAT ATA AGA GAA GTT ATT carr 
A,n Thr A«n ri« i.v. Asp^n. Arg S^IJ ^ ^ ^ 



2976 



3024 



3072 



55,"" fs,"' ^ ^ ; .SI !s; IS a; Ji^ 

1035 X04C 
ATA TTT AAA TTA GAT GGT GAT ATA GAT AGA ACA CAA TTT ATT TGG ATr ^n^o 
lie Phe Lys Leu Asp Gly Asp lie Asp Arg Th? 5?S SI x" ?S Met 



3216 



3264 






- s sj? ^ - - - «j 

^ip s s ^ s; ^ fjj - sr^s ?2 - 

ATT TTA ACA CGT AGC AAA TAT AAT CAA AAT TOT a^a . 
lie Leu Thr Axg s« Lys Tyr Asn IT. JJJ ^ JjJ^ JJJ JJJ 



1130 i£?fs 
AGA GAT TTA TAT ATT GGA GAA AAA TTT ATT iti . 

«p XXJ Sl„ ^ ?S S ^ ^ IS JJJ 

TCT CAA TCT ATA AAT GAT GAT ATA ctt nr^a nui^ 

Ser Gin Ser He Asn AsJ Asp lie v2 A^ I^^ TAT 

H55 ^ iifio ^ -^«P Tyr He Tyr 

^^^^^ 1165 

CTA GA 

20 Leu 



25 



(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS: 



3456 



3504 



3509 



(A) LENGTH: 1169 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

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Met Pro val Thr XXe Asn A.« PHe A.n Ty, ^„ 
Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
Tyr ryr .ys Ala Phe X.ys XXe Thr Asp Arg tx. xXe ixl Pro GXu 

Arg Tyr Thr Phe Oly Tyr Lya Pro Olu Aap Phe Aan Lya Ser Ser Cly 

lie Phe Asn Arg Asp Val Cys Glu Tw tx,^ . 
65 70 ^ ^ ^f? Pro Aap Tyx Leu Asn 

Thr Asn A.P ty. tys Aan He Phe Leu Gin Thr Met 11, x.y. ueu Phe 

90 p5 

Asa Arg He Lys Ser Lys Pro Leu Glv Glu r.v« r..,. t 

100 fii Qlu Met He 

1X0 

lie Aan Gly lie Pro Tyr Leu Gly Aap Arg Arg v.l Pro Leu GXu Gl« 
Phe Aan Thr Aan He Ala Ser Val Thr Val Aan Ly. l!u He Ser Aan 
pro Oly Glu Val Glu Lys Lys Gly He All Asn Leu He He 
Phe GXy Pro Gly Pro Val Leu Aan Glu Aan Glu Thr Xle Aap He Gly 



X70 



Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln

Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu

205

Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr

240

Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe

270

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile

285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn

Ly. V.1 Leu vax Cy. lie Ser Asp p«, J" 

' 320 

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu

350

Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys

Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys

380

Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile

Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile

415

Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr

430

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser

460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr

510

Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys

520 






EP 0 939 818 B1 



Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln

540

Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp

Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp

Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly

590

Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser

605

Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile

620

Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu

Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro

Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile

670

Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys

665

Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp

70S "^'^ ^0 ^"^^ """^ i"-' '-y- «y H« Tyr 

720 

Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp
Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile

760 

Asp Asn He Asn Asn Phe He Asn Glv Cvs Smr- vtii ^ . 

770 77e ^ Ser Tyr Leu Met 

780 

Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn

800

Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr

810 815
Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu

830

Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile

845

Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser Glu Ile Leu Asn Asn Ile

860

Ile Leu Asn Leu Arg Tyr Lys Asp Asn Asn Leu Ile Asp Leu Ser Gly



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TV. «. V.1 OU V.X T,. „u ^„ 

Asn Gin Phe Lys Leu Thr Si>t- c«». »t 

sSo ^•''^ ^^'^ Arg val Thr 

Gln Asn Gln Asn Ile Ile Phe Asn Ser Val Phe Leu Asp Phe Ser Val
Ser Phe Trp Ile Arg Ile Pro Lys Tyr Lys Asn Asp Gly Ile Gln Asn
Tyr He His A.„ GI« T^^ Thr He lie A.„ Cy. ,,.3 

960 

OXy Trp x,ys Xle Ser X.e Arg c.y Asn Arg Xle He Thr x.u Xle 
Asp Ile Asn Gly Lys Thr Lys Ser Val Phe Phe Glu Tyr Asn Ile Arg



Glu Asp Ile Ser Glu Tyr Ile Asn Arg Trp Phe Phe Val Thr Ile Thr

Asn Asn Leu Asn Asn Ala Lv« xi« » 

iOiS 1Q20 
Jsn^Thr ASP rie X.y. Asp^rie Ar« aXu VaX XXe A.. A.„ Oly axu XXe 

1040 

Ile Phe Lys Leu Asp Gly Asp Ile Asp Arg Thr Gln Phe Ile Trp Met

1O50 20S5 

Lys Tyr Phe Ser Ile Phe Asn Thr Glu Leu Ser Gln Ser Asn Ile Glu

1070

Glu Arg Tyr Lys Ile Gln Ser Tyr Ser Glu Tyr Leu Lys Asp Phe Trp

Gly Asn Pro Leu Met Tyr Asn Lys Glu Tyr Tyr Met Phe Asn Ala Gly

Asn Lys Asn Ser Tyr Ile Lys Leu Lys Lys Asp Ser Pro Val Gly Glu

1115 1120
Ile Leu Thr Arg Ser Lys Tyr Asn Gln Asn Ser Lys Tyr Ile Asn Tyr

1130

Arg Asp Leu Tyr Ile Gly Glu Lys Phe Ile Ile Arg Arg Lys Ser Asn

1150
Ser Gln Ser Ile Asn Asp Asp Ile Val Arg Lys Glu Asp Tyr Ile Tyr



(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2574 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2574

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 






|f| s s js? 5^ s; s s X SI Si sti m ^ '» 

240 



192 



240 



ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT n^T nn^ ^ 
Met Pro Vai Thr He Asn Asn Phe AsJ ?^ S 2^ 

10 ^5 

AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT rrr ar-a . 
Asn A-n lie lie Met Met Glu Pro S 7hI fit ^ S° 

TAT TAT AAA GCT TTT AAA ATC ACA GAT nr^ 

Ty. TV. «. P.. u. S S S tli Si 

AGA TAT ACT TTT GOA TAT AAA ncT r*^^ 

ovj ™, p.. S2 S iSJ !^ IS SJ 

ATT TTT AAT AGA OAT GTT TGT GAA TAT nai* z^-,,* 

11. p.» A.. «, ..p Si ?^ »j ^ ;^ Hi 

ATA AAT GGT ATA CCT TAT CTT GGA GAT nn^ . .j.. . 

XXe Acn Cly Xle Pro ^ ^ ^ CTC OJA OAC 
CCA GGA GAA GTQ GAG CGA AAA AAA GGT ATT t-rr- 

Pro Gly Glu Val Glu Arg Lya Lys 5lJ rJl ST t^"^ ''^^ ^™ 
145 ISO feS ^'^^ 

S =^ S §g ?S S iSS S; iiJ SS S J2 £S s §s 

190 

ATG AAG TTT TGC CCA GAA TAT GTA AGf* r-r* m.._ 

Ly. Pg cy. pro Olu v2 5S ^ S Jil SIIK gi 

20S 

AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT Tr« /-n-r r.^» 
^ys Gly Ala Ser He Phe Asn Arg Arg SJ ^ 



480 



528 



576 



624



?5 ^ SI IS S ^ 2? J2 S 52 S ^ 



«. „3 



!S ^ Si SS ?S SK Si in ^ ?S SI S SI S SI 

5? ?s ^ Si? ^ ^ ss Si „^ SI S iu m 



460 



816 



864 



912 



m s ^ s s s s s jii gj - 

2SS 

S5I?S5I5I?SSISIJSSSSia;o=SS5^?SSI 

270 

GGA GGA CAA GAT CCC AGC ATC ATA ACT CCT Trr a/-^ 
Cly Gly Gin Asp Pro Ser He ^ ^ S 

2B0 285 
TAT GAT AAA GTT TTG CAA AAT TTT AGA GGG ATA GTT GAT *ra oi-r 
ryr Asp l.ys Val Leu Gin Aen Phe Arg a?y Hi SS 2? ^ S JJS 

300 

320 

Ki:iJ^Si^2iijigi{^sj?irs;sj-a2s?; - 

e ^' 5S fR 2J SJi SI ^ JIJ J«I - ™ 

SI? s =^ s s s; SI ^i trs s; jsj si 5 JS 



1104 



1152



30 375 

ZU Hi S S S SJ? Si S S Jil ?S 

395 

» - 2i 2J ^1 a; - Si I^^- i?J =^ iSi ^ SI s 



1248 
1296 
1344 



« ?S^:XiS12SIi;SSIJ2SISJJJiXl-si2 



GAT GAT TTA TCT AAA AAC GAA AGA ATA CAA TAT AAT ArA 
ASP ASP x.eu ser Lys Asn Gl« Arg He ^ 

SI JiJ s; iii ^ IS in s jsi s; n; ^s si si s si 
Hi tii s Si Si ss S SI ?e Si IS SI 



1536 



510



520 525 
AAA ATT TTT ACA GAT GAA AAT ACC ATC TTT r^A tut T^n ^» 
Lys Ue Phe Xhr A,p oiu As„ rS ill S SJJ 2J 2^ 

=>J^ 540 

ACA TTT CCT CTA GAT ATA AGA GAT ATA AGT TTi ar* -rr^ 

Xhr Phe Pro r.u Asp iXe ^..^ Asp S S S 12 IS S 

SI SI SI s s IS s IS ;n 2j 

?^^?s^ss^j^s?ssS-§gsiisS§s 
SI? £51 s s n: IS s s SI SI ii: iji 

605 



1824



1872 



AAT ACT ATG GAT AAA ATT CCA GAT ATA TCT CTA ai-r r^r^ 
Asn Met A.p Lya IXe Al. A^ z2 I^ ^IT S ^It 

^ 620 
GCSA TTA OCT TTA AAT GTA GGA AAT GAA ACA arr iaa n^m ^ 

Gly Leu Ala Leu A.n V.X cly A-« IS ^ °S SI 

in SI SI s: s; sf s s s SI 5K s 
s; s SI ?2 s s SI sj S SI SI SI is 5 ts 

S^JlI^JiJSIJSS^JSSSJiSSSISIJSI^ »" 

oou 685 

^i^si£ji?ssgjsifs?o^2is?ss;o^s - 



1968 



2016 



720 

" ^ s ?^ IS s; sji s; s Si SI SI 52 ij; ^ - 

AGA TAT AAT ATA TAT TCT GAA AAA GAA AAG TCA AAT ATT Aan n-ri^ r.«* 
Arg .Vr Asn lie Tyr Ser Glu Lys gK jSJ^ ?S i?J 

?SJII||SJIIlSJ^SiSISSSJSIISS?ISS »" 

765 

GAT AAT ATA AAT AAT TTT ATA AAT GGA TGT TCT GTA Tr* -rn^ tt* 
55 Asp Asn lie Aan Acn Phe lie Asn Glv Cva si; SI^ I 2^52 

770 77I ^ T/r lieu Met 



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"^^^ 800 

ACT CTC AAA AAA AAT TTG TTA AAT TAT ATA <3AT GAA AAT AAA TTi. 

Thr Leu Lys Lys Asn Leu Leu Asn Tyr He XsJ ^ Tyr 

810 ais 

TTG ATT GGA AGT GCA GAA TAT GAA AAA TCA AAA GTA aat -n*/- 

Leu lie Oly Ser Ala Glu Tyr Glu Lys l^r ^ ^ ^11^ 



15 



2496 



2544 



AAA ACC ATT ATG CCG TTT GAT CTT TCA ATA TAT ACC AAT gat hr^ 
Lys Thr lie Met Pro Phe Asp Leu Ser He ?hr ili JS 

840 

CTA ATA GAA ATG TTT AAT AAA TAT AAT AGC 

Leu He Glu Met Phe Asn Lys Tyr Asn Ser ^^''^ 
850 ess 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 858 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn

10 15

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly

60

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn

75

80

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile

Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu

120 125
Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn

140

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly

170 175
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln

185 190
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu

205

Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro

220

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr

235 240
Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe

250 255

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe

265 270

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile

280 285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn

300

Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr

315

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly

330

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu

345 350
Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys



365 



Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile

410

Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr

425 430

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp

445

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser

455 460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn

475 480

Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp

485 490

Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys

520 525

Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln



540
Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp

560

Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp
Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly
Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser

605

Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile

620
Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu

625

640

Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro

655

Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile

660

665 670

Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys

665

Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp

700

Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr

720

Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp
Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile



765 



Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met

Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn

Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr

810

Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu

Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile

845

Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser

850 855 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1644 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
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(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION:1..1644 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:






ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT rax r-r^-r t.-^ 

Met pro V.1 Thr lie Asn Asn Phe ^cl S Hi JJJ 

15 

AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT rrr r^r^^ « 

Asn Asn lie lie Met Met Glu pS L^I gf^ ^ 

TAT TAT AAA GCT TTT AAA ATC ACA GAT CGT ATT Tnr nxn t.^^ r^^^ ^ 
ryr Tyr Lya Ala Pha Lys He Th. ^ l" ?S lie ^I^ ^ 



40 „ 



ISO 



48 



96



144 



192 



AGA TAT ACT TTT GGA TAT AAA CCT GAG GAT TTT 2i&t m/^m 
Arg Tyr Thr Phe GXy Tyr ^y. Pro ^ ^1 i^^!^ 

S Jil SI ?S iti ??5 K iSJ - 

AAT AGA ATC AAA TCA AAA CCA TTQ GGT GAA AAG TTA TTa n^r- . 

Aan Arg He Lys Ser Ly, Pro Leu Gly llS ^ 

ts ^ ^r^ s; ^ sr 5S s sjj ' •« 

120 

TTT AAC ACA AAC ATT GCT AGT GTA ACT «TT aaT niii^ 

Phe Asn Thr Asn He Ala Ser 51? 511 tl^ jj^jj 

CCA GGA GAA GTQ GAG CGA AAA AAA GGT ATT TTC rrA aat -t^» 

Pro GXy GX« V.X OXu Arg Ly. ^ J^* ATA ATA 4B0 

s §?; SI s; iss si - jii - 



528 



576 



624 



ATG AAG TTT TGC CCA GAA TAT GTA AGC GTA TTT AAT AAT GTT taa ri.^ 
Met Lys Phe Cys Pro Glu Tyr Val Ser Val SI ^^1 §lS ^ 

" 200 20S 

45 AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT TCA GAT CCA 

A»n Lys GXy Ala Ser He Phe Asn Arg Arg Gly ^ ^ I^J ^ 






GCC TTG ATA TTA ATG CAT GAA CTT ATA CAT (TET TTA CAT GGA Tta ^^t- 
Ala Leu lie Leu Met His Glu Leu He His S S ^ Su ™J 

240 

s s ?s s s s i^i sti ^ s " 
s s? Si ^ ?s s s 0^ 5 s; §x Si S fj 
s; s?; 1^ s s s js S s ?u •« 

300 

s Jii 5s s IIS jii - - s 

315 

AAA AAT AAA TTT AAA GAT AAA TAT AAA TTC GTT raa n^^r i.^m ^.^ 

Uys Asn Ly, Phe Lya Asp Uys Tyr v" 1^ ^fj 

AAA TAT ACT ATA GAT OTA OAA AGT TTT GAT AAA TTA TAT aa» ir-/. 
Vyn Tyr Ser lie Asp Val Glu Ser Phe As? ^ ZIS ^Jl^ 

34S 

iS? S S ?S Si ?S fJJ S JS Si JiS Si ^ 

360 

S ^ SI S S ^ IS s ^ S is 

360 



430 



445 



GAT GAT TTA TCT AAA AAC GAA AGA ATA GAA TAT AAT ACA CAG AGT aat 
Asp ASP Leu ser Lys Asn Glu Arg He GXu ^ Qln S« 

475 480 

TAT ATA GAA AAT GAC TTC CCT ATA AAT GAA TTA ATT TTA GAT ACT GAT 
Tyr He Glu Asn Asp Phe Pro He Asb Glu Leu xl^ AaJ ThJ 

490 



912 



960 



1152 



1200 



35 T^^ ™ ATC TAT ACT ATA GAG GAA GGG TTT AAT kti. 

Asn I.eu Leu Asp Asn Glu He Tyr Thr He g?S §?S ^ 

395 400 

TCT GAT AAA GAT ATG GAA AAA GAA TAT AGA GGT rAr aat 1.1.1. 
ser ASP uys Asp Met Glu X,y. Glu 1^^ ^ ^ ^JI 

40 415 
AAT AAA CAA GCT TAT GAA GAA ATT AGC AAG GAG CAT TTr rr-r 
Aan Ly. Gin Ala Tyr Glu Glu lie Ser ^ Si I ^ All 



AAG ATA CAA ATG TGT AAA AGT GTT AAA GCT CCA gga hth *rr^ 
Lys lie Gin Mat Cys Ly. Ser Val J?? SI f ^ ^ ^ 



GTT GAT AAT GAA GAT TTG TTC TTT ATA aCT oaT aan H»m »^ ~ 

vel ASP Asn Glu ASP Leu Phe Phe SI (J? 511 

«55 450 



1488



495 



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S ?IJ £ Hi Si Ili - - - - s; cj. 

S S S - s; - - ^ 



ACA TTT CCT CTA 
Thr Phe Pro Leu 
545 



(2) INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 548 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn

10 15

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg

30

Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu

45

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn

75 80

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile

105 110
Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu

120 125

Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn

135

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile

155 160

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly

170

Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln

185

Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu

200 205
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro

220

Jl. II. ^„ ^ 

Oly II. V.1 J.P ^ 

255 

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile

285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn

Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu

345 350
Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys

Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys

380

Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile

410 415
Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser

460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
Lys He Glu Leu Pro Ser Glu «sn qIu Ser Leu Thr 



495 

Leu lie ser Lys He Glu Leu Pro Ser Glu Aan Thr Glu Ser 

510 

ASP Phe Asn val Asp Val Pro Val Tyr Glu Lys Gin Pro Ala He Lys 

520 



Lys He Phe Thr Asp Glu Asn Thr XI. Phe Gin Tyr Leu Tyr Ser Gin 

540 



Thr Phe Pro Leu 
S45 



(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2616 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2616



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ATG CAG TTC GTG AAC AAQ CAG TTC AAC TAT AAr ni^n r^r^ . 
«et Oln Phe V.l Asn Lys Gin Phf 2?° ^* 

SI 25 S ^ JIJ K ^ S S?5 ^ IIS 

s? SI K ?i: Si - SI s gi 

^ S ?S iSS =^ SI §?; 25 S 5S Si 

fSJJ: SJS SI? 5S S J2 2J IS - s s 



S S ?2 25 S ^ J.-? S S J£ n? 5S 



48 



96 



144 



192 



240 



335 



125 

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATr raa r-r-* 

val lie ASP Thr Asn Cy. lie ^ IS 

135 3^40 

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG crn Trr r^nn r^^r. 
Arg Ser GXu Glu Leu Asn Vai ll^ ^ ^= ATT 480 

ISO 1S5 

XiS 12 S ^ ill 5S 5S ^ S JJ^ - 



205 

1^ s i?^ s s s ;s ^ 2S - - 

g ?K IS 25 sr s 2s ?jj ;n S ?n jss - 

235 240 
CGC GTG TTC AAO GTT AAC ACC AAC GCC TAC TAC «Ar arr- nr-r. 
Arg V*I Ph« Val A,n Thr Asn A?i Tyr ^ jJI? ^ ^ 

GAA GTA AGC TTC GAG GAA CTG CGC ACQ Trr nnv ^ 

OXu V31 ser P.e OXu .Xu Z.eu f| ^fj 



2«5 270 
TTT ATC GAC AGC TTG CAG GAG AAC GAG Trr fn-r r-,-^ 
PH. XI. s,r Lau cm Glu ij„ ^ g| 3^ 

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAf3 r-rr jia/- T^r* 

Lys Phe Ly. A.p He Ala Ser Thr jJ^J 2?° J,'^ ^jl^ 



300 



335 



AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT Tao i^nr. 
Ly. Phe A«p .ys Leu Tyr X.y» „ef Leu ^ ?g fJJ 



350 



GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG TTT TTT
Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe

455 4gQ 



816 



664 



912 



315 

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCQ rrji nn-r ™* 

Tyr Leu Leu ser Gl« Aep Thr Ser |g ^ SIJ^ 2?^ 



1056 



-160 

s sf s ti: is fs |s sf j;i 
SI ^ 0=5 ^; ?s Si ;n isi jj; S xi ^ ?£ S 



1248 



45 410 415 

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAfs nn*^ ny. „ 

Ly, Asn Phe Thr Gly te« Phe Gl« Phi ™J 2^ S ^ ^? ^ 

^ 425 430 ^ 

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA CGA Tj^n kkt x»« 

Gly He lie Thr Ser Ly- Thr Ly. Ser JiJS ^'44 



X392 






?r iu* J"" °AA OCA GAA CSAA AAT ATT ACT tt^ 

n« Thr Ser Asp Thr A«n lie Glu Ala Ala GIu ^ s« 



s.' ?f A-j c^;^ ?s gjj S ?s 



670 



ATA CCT GTA TTA GOT ACT TTT GCA CTT GTA TCA TAT ATT rnr tlkt K^r^ 
lie Pro val Leu Oly Thr Phe Ala Leu Va? S« ^ Sl2 ^l^^ 

45 ess 

s s SI s; js ;s - i^- SI s S2 !IS s; 



1488 



GAT pA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT r»a nn^ 

ASP teu lie Gin Gin Tyr Tyr Thr ^ ^ ^CT isafi 



GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAG ATT Axa oro 

01« Asn lie ser He Glu Asn Leu Ser ^JI f,^ "34 

525 

GAA CTT ATG CCT AAT ATA GAA AG A ro-n - 

Glu .eu Met Pro Asn Xle ^ ^ 2^ §JS 



20 "° 555 S60 

AGG 
Arg 

5«5 S70 
^ 585 590 



1824 



1872 



1920 



TTA AAT ATA GOT AAT ATG TTA TAT AAA GAT GAT TTT r-ra nr^ r^r^ ^» 
Leu Asn lie Gly Aj« Met Leu Tyr ^ S^J SI? ^ ^ Su 

«*5 650 

ATA TTT TCA GGA GCT GTT ATT CTQ TTA GAA TTT aTA ^*/* *™ 

lie Ph. ser Oly Ala Val He S[I ?S ^ 



2064 



2112 



AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA aat -rnn tth r./-K 
Ly. Trp Aep Glu Val Ly. Tyr He ^ '^l 

715 720



750 

CAA TAT ACT GAG GAA GAG AAA AAT AAT att 

Gin ry. r.r O.u O.u O.u ^ ^ ^^n^ SI 

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT ana nn-r* 

Leu S« S„ .y. .eu Asn OXu S JJI J^^J 

780 



(2) INFORMATION FOR SEQ ID NO: 26: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 872 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 



2256 



2304 



2352 



2448 



AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA att h^t .*v^ 
A.n Phe Leu A»n Gin Cys Ser Val ^ 
t5 795 8QQ 

ir, ^ ^ SI s; s is s «; 

840 845 
860 

ACA TTT ACT GAA TAT ATT AAG TAA

Thr Phe Thr Glu Tyr Ile Lys * 2616
865 870






EP 0 939 818 B1 



Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110



Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
170

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
330

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
410

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
425

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
475 480



Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
505

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
520

Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
555

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
570

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
585

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
665

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
840

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 860

Thr Phe Thr Glu Tyr Ile Lys *
865 870

(2) INFORMATION FOR SEQ ID NO: 27: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2574 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:



ATGCCGGTTA CCATCAACAA CTTCAACTAC AACOACCCGA TCGACAACAA CAACATCATC 
AXXIATGGAAC CGCCGTTCGC ACGTGGTACC GGTCGTTACT ACAAGGCTIT CAAGAITACC 
GACCGTATCT GGATCATCCC GGAACGTTAC ACCTTCGGTT ACAAACCTOA GGACTTCAAC 
AAGAGTAGCG GGATTTTCAA TCGTCACGTC TGCGAGTACX ATGATCCAOA rTATCTGAAT 
ACCAACGATA AGAAGAACAT ATTCCTTCAG ACTATGATCA AGTTATTTAA TAGAATCAAA 
TCAAAACCAT TGGGTGAAAA GTTATTAGAG ATGATTATAA ATGGTATACC TTATCTTGGA 
GATAGACGTG ITCCACTCGA AGAGTTTAAC ACAAACATTG CTAGTGTAAC TGTTAATAAA 
TTAATCAGTA ATCCAGGAGA AGTGGAGCGA AAAAAAGGTA TrXTCGCAAA TTTAATAATA 
ITTGGACCTG GGCCAGTTTT AAATGAAAAT GAGACTATAG ATATAGOTAT ACAAAATCAT 
TTTGCATCAA GGGAAGGCTT CGGGGGTATA ATGCAAATGA AGTTTTOCCC AGAATATGTA 
AGCGTATTTA ATAATGTTCA AGAAAACAAA GGCGCAAGTA TATTTAATAG ACGTGGATAT 

rrrrcAGATc cagccttgat attaatocat gaacitatac atgttttaca tggattatat 

GGCATTAAAG TAGATGATTT ACCAATTGTA CCAAATCAAA AAAAATTTTT TATOCAATCT 
ACAGATGCTA TACAGGCAGA AGAACTATAT ACATTTGGAG GACAAGATCC CAGCATCATA 
ACTCCTTCTA CGGATAAAAG TATCTATGAT AAAGXTTTOC AAAATITTAG AGGGATAGTT 
GATAGACTTA ACAAGGTTTT AGTTTGCATA TCAGATCCTA ACATTAATAT TAATATATAT 
AAAAATAAAT TTAAAGATAA ATATAAATTC GTTGAAOATT CTOAGGGAAA ATATAGTATA 
GATGTAGAAA GTTTTGATAA ATTATATAAA AGCTTAATGT ITOGTTTTAC AGAAACTAAT 
ATAGCAGAAA ATTATAAAAT AAAAACTAGA GCTTCTTATT TTAGTCATTC CTTACCACCA 
GTAAAAATAA AAAATTTATT AGATAATGAA ATCTATACTA TAGAGGAAGG GTTTAATATA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 






TCTGATAAAG ATATGGAAAA AGAATATAGA GGTCAGAATA AAGCTATAAA TAAACAAGCT 
TATGAAGAAA TTAGCAAGGA GCATTTGGCT GTATATAAGA TACAAATGTG TAAAAGTCTT 
AAAGCTCCAG GAATATGTAT TGATGTTGAT AATGAAGATT TGTTCTTTAT AGCTGATAAA 
AATAGTTTTT CAGATGATTT ATCTAAAAAC GAAAGAATAG AATATAATAC ACAGAGTAAT 
TATATAGAAA ATGACTTCCC TATAAATGAA TrAATTTTAG ATACTGATTT AATAAGTAAA 
ATAGAATTAC CAAGTGAAAA TACAGAATCA CTTACTCATT TTAATGTAGA TCWCCAGTA 
TATGAAAAAC AACCCGCTAT AAAAAAAATT TTTACAGATG AAAATACCAT CTTTCAATAT 
ITATACTCTC AGACATTTCC TCTAGATATA AGAGATATAA GTTTAACATC TTCATTTCAT 
GATGCATTAT TATTTTCTAA CAAAGITTAT TCATTTTTTT CTATGGATTA TATTAAAACT 
GCTAATAAAG TGGTAGAAGC AGGATTATTT GCAGGTTGGG TGAAACAGAT AGTAAAX^T 
rrTGTAATCG AAGCTAATAA AAGCAATACT ATGGATAAAA TTGCAGATAT ATCTCTAATT 
GTTCCTTATA TAGGATTAGC nTAAATGTA GGAAATGAAA CAGCTAAAGG AAATTTTGAA 

AATGCTrrrc agattocagg agccagtatt CTACTAGAAT TTATACCAGA acttttaata 

CCTGTAGTTO GAGCCTTTTT ATTAGAATCA TATATTGACA ATAAAAATAA AAITATTAAA 

acaataoata atgctttaac taaaagaaat cjaaaaatgga gtgatatgta cggattaata 

GTAGCGCAAT GGCTCTCAAC AGTTAATACT CAATTTTATA CAATAAAAOA GGGAATGTAT 
AAGGCTTTAA ATTATCAAGC ACAAGCATXX3 GAAGAAATAA TAAAATACAC ATATAATATA 
XATTCTGAAA AAGAAAAGTC AAATATTAAC ATCGATTTTA ATGATATAAA TTCTAAACTT 
AATGAGGGTA TTAACCAACC TATAGATAAT ATAAATAATT TTATAAATGG AIXTTTCrGTA 
TCATATTTAA TGAAAAAAAT GATTCCATTA GCTGTAGAAA AATTACTAGA CTTTCATAAT 
ACTCTCAAAA AAAATTTGTT AAATTATATA GATGAAAATA AAITATATTT GATTGGAAGT 
GCAGAATATG AAAAATCAAA AGTAAATAAA TACTTGAAAA CCAITATGCC GTTraATCTT 
TCAATATATA CCAATGATAC AATACTAATA GAAATGTTTA ATAAATATAA TAGC 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2574 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 



1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2574

ATGCCAGTTA CAATAAATAA TTTTAATTAT AATGATCCTA TTGATAATAA TAATATTATT 60
ATGATCGAGC CTCCATTTGC GAGAGGTACG GGGAGATATT ATAAAGCTTT TAAAATCACA 120
GATCGTATTT GGATAATACC GGAAAGATAT ACTTTTGGAT ATAAACCTGA GGATTTTAAT 180
AAAACTTCCG GTATTTTTAA TAGAGATGTT TGTGAATATT ATGATCCAGA TTACTTAAAT 240
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ACTAATGATA AAAAGAATAT ATTTTTACAA ACAATGATCA AGTTATTTAA TAGAAXCAAA 
TCAAAACCAT TGGGTGAAAA GTTATTAGAG AltSATTATAA ATGGTATACC ITATCTTGC;, 
GATAGACGTG TTCCACTCGA AGAGTTTAAC ACAAACATTG CTAGTOTAAC TCTTAATAAA 
TTAATCAGTA ATCCAGGAGA AGTGGAGCGA AAAAAAGGTA TTTTCGCAAA TTTAATAATA 
TTTCGACCTG GGCCAGTTTT AAATCAAAAT GAGACTATAG ATATACGTAT ACAAAATCAT 
TTTGCATCAA GGGAAGGCTT CGGGGGTATA ATGCAAATGA AGTTTTGCCC AGAATATGTA 
AGCGTATTTA ATAATOTTCA AGAAAACAAA GGCGCAAGTA TATITAATAG ACGTGGATAT 
TTTTCAGATC CAGCCTTGAT ATTAATGCAT GAACTCATCC ACGTCCTCCA CGGTCTCTAC 
GGTATCAAAG TAGACGACCT CCCGATCGTC CCGAACGAAA AAAAATTCTT CATOCAGAGC 
ACCGACGCAA TCCAGGCAGA AGAACTCTAC ACCTTCGGTG GTCAGGACCC GAGCATCATC 
ACCCCGAGCA CCGACAAAAG CATCTACGAC AAAGTCCTCC AGAACTTCCG TGGTATCGTC 
GACCGTCTCA ACAAAGTCCT CGTCTGCATC AGCGACCCGA ACATCAACAT CAACATCTAC 
AAAAACAAAT TCAAAGACAA ATACAAATTC GTCGAAGACA GCGAAGGTAA ATACAGCATC 
GACGTCGAGA GCTTCGACAA ACTCTACAAA AGCCTCATOT TCGGTTTCAC CGAAACCAAC 
ATCGCAGAAA ACTACAAAAT CAAAACCCGT GCAAGCTACT TCAGCGACAG CCTCCCGCCG 
GTCAAAATCA AAAACCTCCT CGACAACGAA ATCTACACCA TCGAAGAAGG rTTCAACATC 
AGCGACAAAG ACATGGAAAA AGAATACCGT GGTCAGAACA AAGCAATCAA CAAACAAGCT 
TACGAAGAAA TCAGCAAAGA ACACCTCGCA GTCTACAAAA TCCAGATOTO CAAAAGCGTC 
AAAGCACCGG GTATCTGCAT CGACGTTGAC AACGAACACC TCTTCTTCAT CGCAGACAAA 
AACAGCTTCA GCGACGACCT CAGCAAAAAC GAACGTATCG AATACAACAC CCAGAGCAAC 
TACATCGAAA ACGACTTCCC GATCAACGAA CTCATCCTCG ACACCOACCT CATCAGCAAA 
ATCGAACTCC CGAGCGAAAA CACCGAAAGC CTCACCGACT TCAACGTTGA CQTCCCGGTC 
TACGAAAAAC AGCCGGCAAT CAAAAAAATC TTCACCGACG AAAACACCAT CTTCCAGTAC 
CTCTACAGCC AGACCTTCCC GCTAGATATA AGAGATATAA GITTAACATC TTCATTTOAT 
GATGCATTAT TATTTTCTAA CAAAGTTTAT TCA li ' lTilT CTATGGATTA TATTAAAACT 
GCTAATAAAG TGGTAGAA6C AOGATTATTT GCAGGTTGGG TGAAACAQAT AGTAAATGAT 
TTTCTAATCG AAGCTAATAA AAGCAATACT ATGGATAAAA TTGCAGATAT ATCTCTAATT 
GTTCCTTATA TAGGATTAGC TTTAAATGTA GGAAATGAAA CAGCTAAAGG AAATTTCGAA 
AATGCTTTTG AGATTGCAGG AGCCAGTATT CTACTAGAAT TTATACCAGA ACTTTTAATA 
CCTGTAGnrG GAGCCTTTTT ATTAGAATCA TATATTGACA ATAAAAATAA AATTATTAAA 
ACAATAGATA ATGCTTTAAC TAAAAGAAAT GAAAAATGGA GTOATATGTA CGGATTAATA 
GTAGCGCAAT GGCTCTCAAC AGTTAATACT CAATTTOATA CAATAAAAGA GGGAATGTAT 
AAGGCTTTAA ATTATCAAGC ACAAGCATTG GAAGAAATAA TAAAATACAG ATATAATATA 
TATTCTGAAA AAGAAAAGTC AAATATTAAC ATCGATTTTA ATGATATAAA TTCTAAACTT 



300 
360 
420 
480 
540 
600 
660 
720 
780 
640 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560

AATGAGGGTA TTAACCAAGC TATAGATAAT ATAAATAATT TTATAAATGG ATGTTCTGTA 2340
TCATATTTAA TGAAAAAAAT GATTCCATTA GCTGTAGAAA AATTACTAGA CTTTGATAAT 2400
ACTCTCAAAA AAAATTTGTT AAATTATATA GATGAAAATA AATTATATTT GATTGGAAGT 2460
GCAGAATATG AAAAATCAAA AGTAAATAAA TACTTGAAAA CCATTATGCC GTTTGATCTT 2520
TCAATATATA CCAATGATAC AATACTAATA GAAATCTTTA ATAAATATAA TAGC 2574
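
As a reading aid for the nucleotide listings above, the following minimal Python sketch shows how a coding sequence can be translated into its amino acid sequence with the standard genetic code. It is an illustration only and forms no part of the specification: the helper name `translate` and the way the codon table is built are assumptions made for this example, and the input is simply the first 60 nucleotides of SEQ ID NO: 28 as printed in the listing.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding strand (5'->3') into one-letter amino acid codes.

    Hypothetical helper for illustration: whitespace is ignored, trailing
    bases that do not fill a codon are dropped, and translation stops at
    the first stop codon.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO: 28, copied from the listing.
SEQ28_START = (
    "ATGCCAGTTA CAATAAATAA TTTTAATTAT "
    "AATGATCCTA TTGATAATAA TAATATTATT"
)

if __name__ == "__main__":
    print(translate(SEQ28_START))  # MPVTINNFNYNDPIDNNNII
```

The 20 residues obtained (Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn Asn Asn Ile Ile) agree with the start of SEQ ID NO: 24. The same check can be applied to any other row of the listings, bearing in mind that a 2574 base pair sequence corresponds to 858 codons.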



Claims 



1. A polypeptide comprising first and second domains, wherein

2. A polypeptide according to Claim 1 wherein the clostridial toxin is a botulinum toxin.

4. A»«,,p,p,de.ccortin9BCI.™i^„«»=te„„to*h«v,otai„H„„o«i»|,^ 

6. A polypeptide according to Claim 5 wherein said third domain is for binding the polypeptide to an immunoglobulin. 
" JeS^dTrd^^^^^^^^ ^ — synthetic ,G binding domain 

tTs!SlTc:'^T' '° ' ^^"^ '''' --P'*-^ - -cuence that binds to a 

9. A polypeptide according to Claim 8 wherein said third domain is insulin-like growth factor-1 (IGF-1).
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ImoTtnH inr^M?'"' " 'yP« B "St^' Chain, or a fragment or variant 

thereof, and 107 N-terminal amino acids of a botulinum toxin type B heavy chain.

Type SlTc'ar"' '° """" " °' bot"«-m toxin 
arr^SLrmtxl^^^^^^^ ^ ^'"""""^ ^ «3 N-tem,ina, amino 

agZ^TresSleSS^ comprising a botulinum toxin type A light chain variant wherein residue 2 is 

toxtC 'a heaCchafn " ' ^""^ « Botulinum 

B Svy'cJIarn'''"'"'"' ^^""'"'"^ °f Botulinum toxin type 

acrdroTaTo^rmtxr;^^^^^^^ ' •^''^ ^ '^^^ -termina, amino 

21. A polypeptide according to any of Claims 10-20 lacking a portion designated H_C of a botulinum toxin heavy chain.

23. A polypeptide according to any preceding claim comprising a variant of a clostridial toxin and further comprising a site for cleavage by a proteolytic enzyme, which cleavage site is not present in the native toxin.

24. A polypeptide according to Claim 23 comprising a variant of a clostridial toxin light chain and further comprising a site for cleavage by a proteolytic enzyme, which cleavage site is not present in the native toxin light chain.

25. A polypeptide according to Claim 23 or 24 comprising a variant of a clostridial toxin heavv chain H nortinn «„h 

"•a?t:ri™r.ir^^^^^ 

nlTH" P™*^'" ^°'^P';'«'"9 ^ °f (a) a polypeptide according to any of Claims 1-26 with (b) a second polvDeo 

28. A fusion protein according to Claim 27 wherein said second polypeptide binds to a chromatography column. 
i^^::XoT^'"' ^^'^ Chromatography column is an affinity matrix of glu- 

31. A composition comprising a polypeptide according to any preceding claim, said composition being non-toxic in vivo.
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e«p»bl.cart«-. ^"™"'™'"'27.28.29i>r30.mcombinalic».wUbaph.rmaceu«callyao 
36. A nucleic acid encoding a polypeptide or a fusion protein according to any of Claims 1-30.

X'sra:",r •o,7ar«:c°s ss"r„"*» " °* - - 

43. A DNA according to any of Claims 36-42. 

44. A DNA selected from SEQ ID Nos: 1, 3, 5, 9, 11, 13, 15, 17, 19, 21, 27 and 28.

the fusion protein through an affinity matrix aZtLd to rT,"„,H , ' ^ ^'"^ ""^ «'"«"9 

llgand adapted to displace the iZT^ts^reZ^Z Te S ^'"'"^ ^^''^ '"^'^'^ ^ 

47. A method of manufacture according to Claim 45 or 46 in which the nucleic acid is DNA.

48. A cell expressing a polypeptide or fusion protein according to any of Claims 1-30. 

Patentansprüche

1. A polypeptide comprising first and second domains

™™„b.„asaoz,,«o p„»ino, «a «,rES^;tL^^^
wherein the polypeptide is a single-chain polypeptide.
2. A polypeptide according to Claim 1, wherein the clostridial toxin is a botulinum toxin.

6. A polypeptide according to Claim 5, wherein the third domain is for binding the polypeptide to an immunoglobulin.

a^Erd^rosr^rst^^^^^^ 

n.^Ez^X'^rdT'^'' '^'"^ ^"'"^"^ ^'"^ Aminos^u-Bsecuenz un,fasst. die an einen Zaiiober- 

gShSJ;:^itt"''~''''"°'"''^''"^°°^^ ('FCB-1 =insu,in-,iKe 

SS™— — ^^^^^^ 
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■ EEs S ^"^^"^'^ ''^'^ ^'^ He-bezeichneter Tei. einer schweren Ketta eines Botu- 

22. Polypeptid nach einem der Anspr^che 1^, worin die zweite Do.,ane nich. an Zel.oberflachenrezeptoren binden 

28. A fusion protein according to Claim 27, wherein the second polypeptide binds to a chromatography column.

epharoStr ''^ ^'^^"^ Chro.atographies.ule eine Affinitatsmatrix von Glutathlons- 
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pharmazeutisch annehmbaren Trager. 

36. A nucleic acid encoding a polypeptide or a fusion protein according to any of Claims 1-30.

37. A nucleic acid according to Claim 36, encoding a polypeptide or a fusion protein and comprising nucleotides encoding residues 1-448 of a botulinum toxin type A light chain.

38. A nucleic acid according to Claim 36 or 37, comprising nucleotides encoding residues 1-423 of an H_N domain of a botulinum toxin type A heavy chain.

39. A nucleic acid according to Claim 36, encoding a polypeptide or a fusion protein and comprising nucleotides encoding residues 1-470 of a botulinum toxin type B light chain.

ktiSrf ^"T'^ Polypeptid Oder ein Fusionsprotein und umfassend Nu- 

kleotide, kodierend die Reste 1-417 einer H^-Domane der schweren Kette des Botulinustoxins Typ B 

s'ltpSstrdT^^^^ Restriktionsendonuclea- 
sespaitstelle. die in der nativen Sequenz des Clostridientoxins nicht vorhanden ist. 

F^^on^^nT^^^'^'^ ^1' """'"^ Modifikation eines Nukleotids. kodierend ein Polypeptid Oder 

Fusionsprotein nach einem der Anspriiche 1-30. um die Spaltstelle einzufiihren. 

43. A DNA according to any of Claims 36-42.

44. A DNA selected from SEQ ID Nos: 1, 3, 5, 9, 11, 13, 15, 17, 19, 21, 27 and 28.

45. A method of manufacture of a polypeptide according to any of Claims 1-26, comprising expressing a nucleic acid according to any of Claims 36-44 in a host cell and recovering the polypeptide.

46. Verfahren der Herstellung eines Polypeptids nach einem der AnsprOche 1-26. umfassend das Exprimieren einer 
Nukleinsaure, kodierend ein Fusionsprotein nach Anspruch 27. 28, 29 oder 30 in einer WirtszeTe drReinigen 
?n ? f="«0"sproteins durch eine Affinitatsmatrix, angepasst an das ZurOckhal- 
angepasst ist, und das Gewinnen des Fusionsproteins. rusionsproieins 

47. A method of manufacture according to Claim 45 or 46, wherein the nucleic acid is DNA.

48. A cell expressing a polypeptide or fusion protein according to any of Claims 1-30.

Revendications 

1. A polypeptide comprising a first and a second domain, wherein

said first domain is a clostridial neurotoxin light chain or a fragment or a variant of said clostridial neurotoxin light chain, said first domain being capable of cleaving one or more proteins that are associated with a vesicle or plasma membrane and are essential for exocytosis,

said second domain is a clostridial neurotoxin heavy chain portion or a fragment or a variant of said clostridial neurotoxin heavy chain portion, said second domain being capable of (i) translocating the polypeptide into a cell or (ii) increasing the solubility of the polypeptide compared with the solubility of the first domain on its own or (iii) both translocating the polypeptide into a cell and increasing the solubility of the polypeptide compared with the solubility of the first domain on its own,
me dori^aine r ' '''' "'""^ '^^'"^ neurotoxine clostridienne etant absente dudIt deuxi^- 

said polypeptide is a single-chain polypeptide.

2. A polypeptide according to Claim 1, wherein the clostridial toxin is a botulinum toxin.

3. A polypeptide according to Claim 1 or Claim 2, wherein the first domain has endopeptidase activity specific for a substrate selected from one or more of SNAP-25, synaptobrevin/VAMP and syntaxin.

TS^S^:^^"'"'^'"' '-^^^ ^« -^'o^trldienne provient 

5. Polypeptide salon I'une quelconque des revendications 1 A 4 comprenant en outre un troisieme domaine llant l« 

7. A polypeptide according to Claim 6, wherein said third domain is a tandem-repeat synthetic IgG binding domain derived from domain B of staphylococcal protein A.

SueTSS"; ) «■ '^^"^ '^''"^ ^'""aine est le facteur de croissance insullnomi- 

SuZutT! Tf''""^ quelconque des revendications pr6c6dentes. compr«nar,t une chalne I6g6re de toxine 

11. Polypeptide selon la revendication 10. dans lequel Tun ou les deux parmi (a) la chatne leg^re de toxine ou le 
irernrlfq^er^^^^^^^^^^^ 

12. A polypeptide according to Claim 11, wherein the botulinum toxin type A light chain variant has a glutamate at residue 2, a lysine at residue 26 and a tyrosine at residue 27.

frlnZT^ f^'°" revendication 10. dans lequel run ou les deux parmi (a) la chatne l^gere de toxine ou le 
aTb^XTdrty^B.^'"^ '""''^ '^^^'^ '--^^ 3on. Znl 

14. A polypeptide according to any of Claims 1 to 9 comprising a botulinum toxin light chain, or a fragment or a variant of a botulinum toxin light chain, and at least 100 N-terminal amino acids of a botulinum toxin heavy chain.

selon la revendication 14 comprenant une chatne I6g6re de toxine botulinique de type B ou un frao 
Tn^l de";;: B ^'^^ ^"'"^^ -ouS de 'toxme botu: 

16. A polypeptide according to Claim 11 or Claim 12 comprising at least 423 of the N-terminal amino acids of a botulinum toxin type A heavy chain.

17. A polypeptide according to Claim 16 comprising a botulinum toxin type A light chain and 423 N-terminal amino acids of a botulinum toxin type A heavy chain.

18. A polypeptide according to Claim 16 comprising a botulinum toxin type A light chain variant wherein residue 2 is a glutamate, residue 26 is a lysine and residue 27 is a tyrosine, and 423 N-terminal amino acids of a botulinum toxin type A heavy chain.

roSdTt^rso^rqr^^^^ -^^'-^ -'-^^ cname 
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4^?S«V1TnT'"'''''"^^^ comprenant au moins une chatne legere de toxine botulinique de type B et 
41 7 acides aminfes N-terminaux d'une chaTne lourxle de toxine botulinique de type B. 

dtennf 2*^^ des revendicaUons pr6cedentes comprenant un variant d'une toxine clostri- 



TreS'I? " ^evendlcatlon 23 comprenant un variant d'une chatne legere de toxine clostridienne et com- 

--^ ^-^-^ '^"-^e n'est^p"a"s%re:t 

25. A polypeptide according to Claim 23 or Claim 24 comprising a variant of a clostridial toxin heavy chain H_N portion and further comprising a site for cleavage by a proteolytic enzyme, which cleavage site is not present in the native toxin heavy chain portion.

SjS?e%\''llrr'' °" ^''^ modification d'un ADN codant pour le 

polypept.de de man.6re d y introduire un ou plusleurs nucteoUdes codant pour le site de clivage. 

27. A fusion protein comprising a fusion of (a) a polypeptide according to any of Claims 1 to 26 with (b) a second polypeptide which is a polypeptide or an oligopeptide that binds to an affinity matrix so as to enable purification of the fusion protein using said matrix.

c^^ratogiX" " 2^' '^^"^ polypeptide se lie . une colonne de 

T^IZZS^r^O^::::::^^'"' ^^"^ '^^"^"^ '^'^ chromatographle est une matrice 

LTi?™ '"''r '^^«^«"<^''=«««" 27. 28 ou 29. dans laquelle un site sp6cifique de clivage par protease 

SZSrnoXM::ro.'^^^^^^^^ •'-^"'''"''"^ -endlcations pr^Cdentes. ladite compo- 

"•d^urarr^"^^^^^^^^ 
^'■^rd=-^^^^^^ 

combinaison avec un support physiologiquement acceptable. ' 
36. A nucleic acid encoding a polypeptide or a fusion protein according to any of Claims 1 to 30.

37. A nucleic acid encoding a polypeptide or a fusion protein according to Claim 36 and comprising nucleotides encoding residues 1 to 448 of a botulinum toxin type A light chain.

^iTSrfu?!'^!^^^^^^^ '? ^«^«"'^'<=^«°" 37 comprenant des nucleotides codant pour des 

residus 1 d 423 d un domaine d'une chatne lourde de toxine botulinique de type A. 

39. A nucleic acid encoding a polypeptide or a fusion protein according to Claim 36 and comprising nucleotides encoding residues 1 to 470 of a botulinum toxin type B light chain.

^f'^^""°'^'<'"f^^°'^^"tP«"^ ""polypeptide ou une proteine de fusion selon la revendication 36 ou la revendication 
bLrq^dTlypre""''''"^^^^ ' ' ''"^ de Chatne lo^r^e tole 

41. A nucleic acid according to any of Claims 36 to 40 comprising nucleotides encoding a restriction endonuclease cleavage site not present in the native clostridial toxin sequence.

42. Aclde nucleique selon la revendicaUon 41 susceptible d'etre obtenu par modification d'un nucleotide codant oour 
ledirsSS clfv^Se"" ''""^ revendications 1 . 30 de s^rte^fntlr: 

43. A DNA according to any of Claims 36 to 42.

44. A DNA selected from SEQ ID Nos: 1, 3, 5, 9, 11, 13, 15, 17, 19, 21, 27 and 28.

hZT^ ."'f !r Polypeptide selon I'une quelconque des revendications 1 d 26 comprenant I'exoression 
polypepre ' '"""^ revendications 36 . ZTi !^S^Z 

S^!^ ?n 'f ,'^*=,^!'f"5"" polypeptide seton I'une quelconque des revendications 1 S 26 comprenant I'expression 

^ 30 "a pulfiStS^'de';: n^tf' ""TT '^"^ ^'^ revenrat^ T'Ts 29 

P""floation de la proteine de fusion par elution de la prot6ine de fusion d travers une matrice d'affinit6 

47. A method of manufacture according to Claim 45 or 46, wherein the nucleic acid is DNA.

48. A cell expressing a polypeptide or a fusion protein according to any of Claims 1 to 30.
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Protein concentration (ng/ml) 

FIG. 3 
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Panel A. Panel B. 
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